Since I started using a WordPress blog back in 2005, I’ve always had the Akismet WordPress plugin installed, and it has been my sole spam protection. It has done an awesome job: at this moment it reports a 99.843% accuracy rating and has blocked 21,215 spam comments, 6,686 of them in the last 6 months alone.
About a week ago, I found an additional spam blocking plugin that has also been very helpful. This one is called Bad Behavior.
In my observations over the last couple of months, it appears that Akismet will block a comment that doesn’t seem to have any correlation to the content of the blog post. This would be why you see comments in your Spam queue that contain no links, no really harmful URLs, and just random text or pointless statements in the body. I’m sure Akismet is much more complicated than that, though, and I would assume there is a backend database of known spamming IPs/hosts that it may also check against. However, the simplest, and likely the first, method of detecting spam is by content.
Not with Bad Behavior. Instead of checking the content of the spam, it looks at the stuff you can’t see – the HTTP Headers, IP, User-Agent String, etc. From their own website…
Bad Behavior analyzes the HTTP headers, IP address, and other metadata regarding the request to determine if it is spammy or malicious. This approach has proved, as one user said, “shockingly effective.” After all, spammers write their bots on the cheap, and have little incentive to code very well. If they could code very well, they probably wouldn’t be spammers.
When Bad Behavior looks at a request, it determines if the request matches a profile of known malicious or spammy activity, and falls outside the bounds of a normal human browsing the web. If so, the request is blocked. But a way out is provided for any human beings with unusual configurations or viruses/Trojans on their computer who may be blocked.
Source: How Bad Behavior Works
Here’s an example of some of the content it has blocked from this very blog…
The request in the image above uses a User-Agent string that gives the Windows version as “Windows XP”. Anyone who has done their homework before making up a User-Agent string knows that browsers on Windows XP actually report it as Windows NT 5.1; the number is the NT kernel version, not the Service Pack level. Since “Windows XP” is not a value a real browser would normally send (even though they went to so much trouble to include all the other information in the header), the request was blocked.
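To make that check concrete, here is a minimal sketch of the idea in Python. It is not Bad Behavior's actual rule (the plugin itself is written in PHP); the function name looks_forged and the pattern are my own assumptions, and the heuristic is deliberately simplified.

```python
import re

# Hypothetical heuristic, not Bad Behavior's real rule set.
# Internet Explorer on Windows XP reports "Windows NT 5.1", so the
# marketing name "Windows XP" in a User-Agent suggests a hand-built string.
_MARKETING_NAME = re.compile(r"Windows XP", re.IGNORECASE)


def looks_forged(user_agent: str) -> bool:
    """Return True if the User-Agent contains the literal 'Windows XP'."""
    return bool(_MARKETING_NAME.search(user_agent))
```

With this sketch, looks_forged("Mozilla/4.0 (compatible; MSIE 6.0; Windows XP)") returns True, while the same string with "Windows NT 5.1" returns False. A real filter would need more care, since a few legitimate clients may format this field differently.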
With this image, the plugin saw that the request was missing the “Accept” header, which tells the server what content types the client is willing to accept in the response. Most of the bot-posting attempts that I have seen blocked in the past week or so have been of this type.
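A missing-header check like this one is simple to sketch. The Python below is only an illustration of the idea, assuming the request headers arrive as a plain dictionary; the name missing_accept is hypothetical and this is not the plugin's own code.

```python
def missing_accept(headers: dict) -> bool:
    """Return True if the request has no usable Accept header.

    Hypothetical helper for illustration; `headers` maps header names
    to their string values.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    # Treat an absent, empty, or whitespace-only value as missing.
    return not normalized.get("accept", "").strip()
```

Normal browsers send an Accept header on every page request, which is why its absence is a cheap signal that a script, rather than a person, is on the other end.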
According to the Bad Behavior Benefits and Features page, the plugin runs before any of your PHP-based software (yeah, that’s right, it is available for any PHP-coded site, not just WordPress blogs), so your application never has to do the work of serving a bot that is just “harvesting data and delivering junk.” Instead, the bot is given a 4xx-style error and never gets any content from your site.
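The “runs before your software” idea can be sketched as a thin wrapper around the application. The example below uses Python's WSGI interface purely for illustration, since Bad Behavior itself is PHP; screen_requests and is_suspicious are hypothetical names, and the choice of 403 as the status code is my assumption.

```python
def screen_requests(app, is_suspicious):
    """Wrap a WSGI app so suspicious requests are rejected before it runs.

    `is_suspicious` is a caller-supplied function (hypothetical) that
    takes the WSGI environ dict and returns True to block the request.
    """
    def middleware(environ, start_response):
        if is_suspicious(environ):
            # Answer with a 4xx error; the wrapped app is never called.
            body = b"Request blocked.\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Looks like a normal request: hand it to the real application.
        return app(environ, start_response)

    return middleware
```

For instance, screen_requests(app, lambda environ: "HTTP_ACCEPT" not in environ) would turn away any request that arrives without an Accept header, and the blog software behind it would never see that request.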
There are more features and settings that I haven’t had a chance to play around with yet, but if I find it necessary, I’ll create an additional post or add them to this one. I recommend this plugin to go alongside any other spam protection you have in place on your form-driven website or blog.