Blog Spam

Blog spam is so, so, so annoying. I looked through the server logs and I have over 3200 requests to create a comment on a single blog entry of which none have gone through. So it’d seem like my CAPTCHAs have saved me the hassle of having to delete 3200 comments. And that’s just on one blog entry.
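
A count like that can be pulled out of an access log with a few lines of Python. This is only a sketch, not necessarily how the number above was produced: it assumes Apache/nginx "combined" style log lines, and the `/comment` URL suffix and the `count_comment_posts` name are hypothetical, so adjust them to whatever your blog software actually uses.

```python
import re
from collections import Counter

# Assumption: the request appears in the log line in quotes,
# as "METHOD /path HTTP/x.y" (Apache/nginx combined log style).
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def count_comment_posts(lines, comment_suffix="/comment"):
    """Count POST requests per path for paths ending in comment_suffix.

    comment_suffix is a hypothetical URL scheme; change it to match your blog.
    """
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # skip lines that do not look like a request
        if match.group("method") == "POST" and match.group("path").endswith(comment_suffix):
            counts[match.group("path")] += 1
    return counts
```

Feeding it the lines of a log file (for example `count_comment_posts(open("access.log"))`, where the file name is a placeholder) gives a per-entry tally of comment attempts, which can then be compared with the number of comments that were actually published.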

A lot of people use Akismet to deal with their spam needs. We don’t use it at the moment because Geneone lacks support for Akismet and the CAPTCHAs seem to be a pretty good measure for stopping spam.

Even though I have CAPTCHAs, I still get blog spam. This started off several months ago and tended to consist of posts with a ton of links and junk in HTML and BBCode. I suppose the idea is that the blog software would support at least one of those input languages. This kind of obvious spam has more or less stopped.

Non-Obvious URL Link Spam 

Since then, I’ve been getting up to 3 spams on a bad day. The difference is, these commenters will often take the time to write something such as "Nice script, works in all the major browsers, I’m going to implement this on my site. Thanks!" It’s quite obvious that it’s spam as they will specify their name as "Trucks" and link to a website which just sells a ton of trucks. 

Sometimes spammers will copy the content of an existing comment and paste it as a new comment, which saves them from having to write one. But they’re dead easy to spot.
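
Copied comments are also easy to spot automatically. The following is a minimal sketch, assuming the existing comments on an entry are available as plain strings; `normalise` and `is_copied_comment` are hypothetical helper names, not part of any blog software.

```python
import re


def normalise(text):
    """Lower-case the text and keep only word characters, so changes in
    case, spacing or punctuation do not hide a copy."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def is_copied_comment(new_comment, existing_comments):
    """Return True if new_comment matches an existing comment after normalisation."""
    seen = {normalise(comment) for comment in existing_comments}
    return normalise(new_comment) in seen
```

This only catches exact copies (ignoring case, spacing and punctuation). A spammer who rewrites a few words would slip past it, so it is a first filter rather than a complete defence.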

CAPTCHA Sweatshops 

Even though I have CAPTCHAs, I still get spam. Why? Most likely because a human has been paid to sit there and spam websites. Spammers pay people in developing countries a few cents an hour to go online and solve CAPTCHAs: 12 CAPTCHAs a minute, all day long.

An article on the Guardian website a few days ago talked about people in developing countries being paid to spam blogs.

Whilst we’re on the topic of blog spam, check out the blog entry from a while back about why nofollow is the wrong solution to blog spam.

4 thoughts on “Blog Spam”

  1. The simple way round that is to have English CAPTCHAs: something that requires a good knowledge of English. Sure, it means those who can’t speak English too well won’t be able to post… but if they don’t know English well enough to answer the CAPTCHA, then is their comment really worth having?

    Or topical questions would work, but they can be region-specific (e.g. something that makes the news in the USA may not in the UK).

  2. I’ve started using sums as my Captchas. Just simple stuff at first, but if the spammers crack that I’ll make them more difficult. It’s only text at the moment, which means that the sums can be quite easily solved by a spider, but the spider/spammers first need to work out that they have to do the sum rather than just enter the text in. 

  3. Unfortunately the odds are that there will be very few English-specific questions that you can ask, and your average spammer can probably enumerate those more cheaply than farming out CAPTCHAs. (What an awful acronym, btw. Why not "Human or Fake Filters" or HOFFs?)

    Perhaps we need some reputation-based system, so that a person’s own signal-to-noise ratio must be high enough for them to be heard.  Such a system should, for preference, be anonymous both in relation to who a person is and to what else they do online.  We make a default assumption that anyone below a certain threshold is a spammer, and occasionally have a look and shove a few points in the direction of anyone who isn’t.  If basic reputation-based cash doesn’t work, then we could make use of algorithms which not only allow users to make ratings, but also feed back the same kind of ratings that they give, which makes such a system very hard to subvert.  IIRC these exist, though I’m surprised that they’re not all over the filesharing networks yet.

  4. "I’ve started using sums as my Captchas. Just simple stuff at first, but if the spammers crack that I’ll make them more difficult. It’s only text at the moment, which means that the sums can be quite easily solved by a spider, but the spider/spammers first need to work out that they have to do the sum rather than just enter the text in. " – James

    This has another added bonus – it increases practice of arithmetic!

     You should include some light algebra as well, or it may get boring 😀
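
The sums-as-CAPTCHAs idea from the comments above is simple enough to sketch in a few lines. This is not the commenter's actual implementation, just an illustration under stated assumptions: `make_sum_captcha` and `check_answer` are hypothetical names, and the expected answer is assumed to be kept server-side (for example in the session) rather than sent to the browser.

```python
import random


def make_sum_captcha(rng=random):
    """Return a (question, expected_answer) pair for a simple addition sum."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b


def check_answer(submitted, expected):
    """Return True if the submitted text is an integer equal to the expected answer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False  # non-numeric input, e.g. the question text pasted back
```

As the comment itself points out, a plain-text sum is trivial for a spider to solve once the spammer notices it, so this mostly filters out bots that blindly echo whatever text is on the page.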
