This, plus a handful of simple firewall rules in the raw table, can block 90%+ of that remaining 1% just by looking at the spoofable banner, which none of the bots seem to spoof. I assume that is due to them being lazy like me.
In the raw table:
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d [my server ip] -m string --string "SSH-2.0-libssh" --algo bm --from 10 --to 60 -j DROP
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d [my server ip] -m string --string "SSH-2.0-Go" --algo bm --from 10 --to 60 -j DROP
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d [my server ip] -m string --string "SSH-2.0-JSCH" --algo bm --from 10 --to 60 -j DROP
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d [my server ip] -m string --string "SSH-2.0-Gany" --algo bm --from 10 --to 60 -j DROP
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d [my server ip] -m string --string "ZGrab" --algo bm --from 10 --to 60 -j DROP
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d [my server ip] -m string --string "MGLNDD" --algo bm --from 10 --to 60 -j DROP
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d [my server ip] -m string --string "amiko" --algo bm --from 10 --to 60 -j DROP
Adding the server IP minimizes the risk of also blocking outbound connections, as raw is stateless.
I rarely do this any more given they rotate through so many LTE IPs. Instead I get the bot operators to block me by leaving SSH on port 22 and then giving them a really long VersionAddendum that seems to get the bots feeling broken, sticky and confused. There are far fewer SSH bot operators than it appears. They will still show up in the logs, but that can be filtered out using drop patterns in rsyslog.
VersionAddendum " just put in a really long sentence in sshd_config that is at least 320 characters or more"
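If you want to sanity check the length before touching sshd_config, something like the following might help. It is only a sketch: make_addendum is a made-up helper name, and the 320 character minimum is simply the figure mentioned above, passed as a default.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSSH): print a VersionAddendum line
# only if the text reaches the minimum length (default 320 characters).
make_addendum() {
    text=$1
    min=${2:-320}
    if [ "${#text}" -lt "$min" ]; then
        echo "addendum is ${#text} characters, want at least $min" >&2
        return 1
    fi
    printf 'VersionAddendum "%s"\n' "$text"
}

# Example: review the output, then paste it into sshd_config by hand.
# make_addendum "$(cat my_long_sentence.txt)"
```

After editing sshd_config, running sshd -t checks the file for syntax errors before you restart the daemon.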
Try it out on a test box that you have console access to just in case your client is old enough to choke on it. Optionally use offensive words for the bots that log things to public websites. Only do this on your hobby nodes, not corporate owned nodes unless legal is cool with it, in writing.
I don't know if this is still the case, but -m string used to be resource intensive, because it has to parse each packet for the string before passing it on to other rules.
It can be. In this case, however, it is limited to eth0, tcp, port 22. If any of those don't match there will be no parsing and thus no impact. Another mitigating factor is that we are only looking at a specific byte range of the packet, so parsing is minimized. On busy SFTP servers I would probably avoid using such rules if CPU load is becoming a problem. For most people this will not even register in htop or vmstat. There are also ways to use this string check in combination with ipset and/or xt_recent to minimize the number of times we see a packet from a bot. Here is an example using an IPSet called "bots" that we drop early on in the raw table and also use in the filter outbound rules to reset openssh when it tries to respond the first time we see the bad string, so we close the socket earlier.
Anything I explicitly drop I do so in the raw table to keep it out of the state table. The state table is more CPU expensive, especially at high packet rates, and runs the risk of exhausting the default state table limits, especially for anything that now has a broken state on purpose like these poor lil bots. Since I brought it up, here is how to increase the state table limits.
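A rough sketch of how the ipset variant could look, not a tested ruleset. It assumes a set created beforehand with "ipset create bots hash:ip timeout 3600" (the one hour timeout is an arbitrary pick), and bot_rules is a made-up helper that only prints iptables-restore input so it can be reviewed first. Only one banner string is shown; the others would follow the same pattern.

```shell
#!/bin/sh
# Sketch only: prints rules for review, applies nothing.
# Assumed to exist already: ipset create bots hash:ip timeout 3600
bot_rules() {
    ip=$1   # the address sshd listens on
    cat <<EOF
*raw
# Sources already in the set are dropped before any string matching.
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d $ip -m set --match-set bots src -j DROP
# First sighting of a bad banner: remember the source address.
# SET does not end rule traversal, so this one packet still reaches sshd.
-A PREROUTING -i eth0 -p tcp -m tcp --dport 22 -d $ip -m string --string "SSH-2.0-libssh" --algo bm --from 10 --to 60 -j SET --add-set bots src
COMMIT
*filter
# sshd's reply to a listed address is rejected with a TCP reset.
-A OUTPUT -o eth0 -p tcp -m tcp --sport 22 -s $ip -m set --match-set bots dst -j REJECT --reject-with tcp-reset
COMMIT
EOF
}

# Example: bot_rules 192.0.2.10 > /tmp/bots.rules
```

Once the output looks right, it could be loaded with iptables-restore --noflush so the existing rules stay in place.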
# from /etc/sysctl.conf: increase state table limits.
# Requires 1/4 mem to hash table plus 400 overhead because I am the cargo culting king:
# cat /etc/modprobe.d/nf_conntrack.conf
# options nf_conntrack expect_hashsize=256400 hashsize=256400
net.nf_conntrack_max = 1024000
Should people use default state table memory allocations on a busy node, everyone can be locked out of it regardless of how many TB of RAM are free. The node can appear "down".
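For anyone picking a different limit, the arithmetic behind the numbers above (a quarter of nf_conntrack_max plus 400) can be sketched like this. hashsize_for is a made-up helper name, and the plus 400 is just the rule of thumb from the comment above, not something the kernel requires.

```shell
#!/bin/sh
# Hypothetical helper: derive the hashsize used above from nf_conntrack_max
# (one quarter of the max, plus the 400 "overhead" from the comment).
hashsize_for() {
    max=$1
    echo $(( max / 4 + 400 ))
}

hashsize_for 1024000   # prints 256400

# On a live box with nf_conntrack loaded, current usage versus the limit:
# cat /proc/sys/net/netfilter/nf_conntrack_count
# cat /proc/sys/net/netfilter/nf_conntrack_max
```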