Mat Balez

Adventures in Spam: Part I

We here at Defensio HQ see a lot of spam, in all its flavors and incarnations. Occasionally we see new techniques that baffle the mind. URL-less spam (that is, spam containing no URLs) is one of these baffling new forms to cross our desk, and it is puzzling enough to be worth delving into to understand what in the world it means.

Example

URL-less spam looks like the following:

[Screenshot: a spam comment without any URL]

Notice that this commenter (i.e. spammer) has not left a URL with his/her credentials, nor has he/she supplied any URLs in the body of the comment.

The Issue

Why is this strange? Because the entire reason spammers typically hit blogs with their bogus comments is to populate the web with URLs that link back to their spammy sites, and thus manage to exploit the Google juice of the sites they breach with the goal of boosting their own search engine rank. And so, bombarding a blog with comments that do not contain URLs defeats the whole purpose, and results in no obvious net benefit to the spammer, other than the evil satisfaction of annoying the hell out of bloggers.

Motives

So if not to exploit Google juice, why do spammers go with a URL-less approach? Two theories:

1) To “train” spam filters to allow specific keywords.

Statistical filters learn over time. By getting legitimate-looking comments that contain a handful of specifically chosen keywords through the filter, spammers could be trying to tip statistical filters toward treating those keywords as innocent, thus increasing the likelihood that future spam comments containing these words will bypass spam defenses.

2) To be whitelisted.

Some spam filters allow users that successfully post comments X number of times to be added to a whitelist, meaning they will bypass the filter in the future. Since URL-less spam typically looks fairly normal, spammers hope that bloggers will fail to identify their comment as spam enough times that auto-whitelisting might kick in.
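The whitelisting theory can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Defensio or any particular filter works: the `AutoWhitelist` class, the `handle_comment` function, the threshold of 3 (standing in for "X"), and the link-detecting `has_url` check are all hypothetical.

```python
from collections import Counter


class AutoWhitelist:
    """Toy auto-whitelist: trust an author after `threshold` published comments."""

    def __init__(self, threshold=3):  # 3 stands in for the article's "X" (assumption)
        self.threshold = threshold
        self.published = Counter()

    def is_trusted(self, author):
        return self.published[author] >= self.threshold

    def record_published(self, author):
        self.published[author] += 1


def handle_comment(whitelist, author, text, looks_spammy):
    """Return "published" or "held"; trusted authors skip the filter entirely."""
    if not whitelist.is_trusted(author) and looks_spammy(text):
        return "held"
    whitelist.record_published(author)
    return "published"


def has_url(text):
    # Stand-in filter (assumption): flags any comment containing a link.
    return "http://" in text or "https://" in text


wl = AutoWhitelist(threshold=3)
bland = "Thank you for the great website, a true resource."
link_drop = "Visit http://example.com for deals"

first_try = handle_comment(wl, "spammer", link_drop, has_url)   # filter catches it
warmup = [handle_comment(wl, "spammer", bland, has_url) for _ in range(3)]
second_try = handle_comment(wl, "spammer", link_drop, has_url)  # filter is skipped
print(first_try, warmup, second_try)
```

In this toy setup the same link-laden comment is held on the first attempt but published once three bland, URL-less comments have earned the author a whitelist entry.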

These motives are simply our best guesses at what might be in spammers’ nefarious minds. Who knows? Simple annoyance could be their sole, inexplicable goal.
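To make the first theory (training the filter) more concrete, here is a minimal sketch of a word-frequency filter in Python. It is a toy, not Defensio's algorithm: the `TinyBayesFilter` class, its smoothing scheme, the sample comments, and the keyword "casino" are all assumptions chosen for illustration.

```python
from collections import Counter
import re


class TinyBayesFilter:
    """Toy statistical filter that tracks how often words appear in spam vs. ham."""

    def __init__(self):
        self.spam_words = Counter()
        self.ham_words = Counter()
        self.spam_total = 0
        self.ham_total = 0

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, is_spam):
        words = set(self.tokenize(text))  # count each word once per comment
        if is_spam:
            self.spam_words.update(words)
            self.spam_total += 1
        else:
            self.ham_words.update(words)
            self.ham_total += 1

    def word_spamminess(self, word):
        # Smoothed estimates of P(word | spam) and P(word | ham).
        p_spam = (self.spam_words[word] + 1) / (self.spam_total + 2)
        p_ham = (self.ham_words[word] + 1) / (self.ham_total + 2)
        return p_spam / (p_spam + p_ham)


filt = TinyBayesFilter()
for _ in range(5):
    filt.train("cheap casino bonus click here", is_spam=True)
    filt.train("nice write-up, thanks for sharing", is_spam=False)

before = filt.word_spamminess("casino")

# Ten URL-less comments slip through and are learned as legitimate.
for _ in range(10):
    filt.train("great casino article, a true resource", is_spam=False)

after = filt.word_spamminess("casino")
print(f"before: {before:.2f}, after: {after:.2f}")
```

The score for the keyword falls once the filter has learned the bland comments as legitimate, which is the kind of drift the spammers may be hoping for.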

  1. 59 Responses to “Adventures in Spam: Part I”

  2. Most of these are to get the IP whitelisted, as you pointed out.

    I have noticed a few posts coming in like this. Now they can go into old posts and add their links at will.

  3. Are the follow-ons still being marked as spam by Akismet and its ilk at this point, for other message characteristics? Or have we yet seen any followups to these messages?

    It is a bit difficult to get into the mind of a spammer, I think, unless you are a spammer. I don’t even think I want to be in there. ;o)

  4. On my blog, the comments are getting past all anti-spam options, once they have been approved.

    I do not know if most of the blank posts have been automated yet, but it is very simple to do by hand. So I am sure that it has been automated and we will see a lot more of this.

    As an example, was this comment already approved?

  5. Mark,

    Defensio caught the comment in the screenshot as spam. We’ve been doing pretty good with those.

  6. Michael, we’re actually having a lot of fun following them. You wouldn’t believe the kind of tricks they come up with ;-)

  7. I always suspected that they were footprints left behind to mark a blog as open to comments. A follow up at a later date would drop the drive by links.

  8. I agree with John Andrews - It seems that they are testing the waters - and then they compile a list of sites they can hit.

    We started seeing these over @ http://www.askTheAdmin about a month ago. We turned off being able to anonymously post for a few days and then de-activated it… No more automated blog spam!

  9. Good insight. I came here via Digg, your article made it to the front page
    http://digg.com/tech_news/Are_You_Seeing_this_New_Kind_of_Blog_Spam
    PS: you may want to change the colour scheme, it is hard to read on low contrast displays

  10. Another thing I’ve noticed is that oftentimes blog spam doesn’t contain actual URLs in the body of their comment, but they do in the “Website” field. Keeping the URL out of the comment area gives them a higher likelihood of passing the spam filters, and the URL may still be linked in from their name.

  11. The name in the comments on blogs such as this is the person’s website… could that affect the results on Google and such? If so, that would be a possible reason.

  12. Just add a CAPTCHA and there will be no problems.

  13. Are you using Akismet to protect against the kind of links that I just did? I think ultimately the intention should be to get relevant comments which contribute to your piece. If they are fleeting and useless affirmations like the example you provided they need to get banned, but a lot of real comments by real industry pros are legitimate, but we also use the SEO techniques to benefit our own sites while we are at it. I view it as one hand washes the other, it’s really the non-contributors and auto-bots we need to crush.

  14. I have noticed this also, it’s rather annoying.

  15. HostFX: Spammers know how to bypass captchas nowadays. Of course, it will reduce the amount of spam, but it doesn’t eliminate it.

  16. It’s for whitelisting and then return link dropping.

    If they use the same line(s) in the blogs they can do a google search on their string to see which have been indexed, which prompts them to return and drop in their URLs.

    Markus Diersbock

  17. local business advertising:

    1st, good try at triggering our spam filter ;-)

    2nd, we’re not using Akismet. You must not know about Defensio; we’re a new spam filtering web service that can be used on blogs and web applications. Of course, we eat our own dog food ;-)

  18. Referring to my post above…

    If you do a google search on “a true resource, and one many people clearly enjoy” it returns 741 hits.

    Markus Diersbock

  19. Spam filtering is futile. I hardly get any comment spam (between 4 and 8 a day) because I *report* each incident which results often in the spamvertized sites being taken down.

    Instead of focusing on 1001 filter tricks it would be nice if people could develop something similar to spamcop (hint hint).

    Comment spam has several weaknesses that makes it easier to follow up to abuse reports compared to email spam. Yet it saddens me how many developers focus on filtering, IMNSHO the wrong way to go.

  20. indeed, i encountered a big problem with these on a blog i administer. it was a solid few months ago, but as the earlier commenter mentioned, a very simple captcha has foiled them.

    –adam

  21. John: I hate to say that but you probably don’t get a lot of spam because your site doesn’t have a lot of Google juice.

    Some of our testers get over 2000 spam comments per day: Spam is a huge problem for them.

  22. This is why I constantly delete comments that seem suspicious. Didn’t know the reasoning behind it, however.

    I’m getting a lot of spammer trackbacks as well.
    Trackbacks don’t get caught in the comment spam filter, but still give a link back to the original blog.

  23. Call me a geek, but these tactics gave me goosebumps. Thanks for analyzing this. I’ll be sure to keep a look out for that in my blog site.

  24. These comments still contain a URL, so it does add inbound links. Only a single one, but the generic post is easy to add to thousands of sites quickly…

  25. Is it me or are some of the comments to this post reflective of the problem described in the article? E.g., the one right above me by Blog Spam at Lenwood?

    Or is that some kind of ‘Pingback’, which is a blogosphere term I’ve heard but never fully understood (other than a way to boost your Google ‘juice’, as you call it).

  26. Dude the URL is embedded in the signature, the byline, isn’t it? Rather than the body of the message?

  27. Vasper/Jason: They don’t contain a URL at all. What you are seeing is the Defensio interface, which might be a little different from what you’re used to seeing.

  28. Wow, this is weird, then. Thanks so much for exposing it. :^)

    Am going to Twitter this valuable post.

  29. Or maybe it’s in order for bots to pick up email address’ when the email listed is hit by a bot that has hijacked another email address it can log the email in return send more spam to the spammers!

    Talk about letting things come to you!

  30. I feel for you guys, really. I have never owned a blog myself, but I do know how much of a problem spam can be with blogs alone. I do so dislike hearing of my friends’ blogs being trashed with spam. I have never heard of this service, but I will recommend it to them and see what they think (of course, they have only heard of Akismet as well). If your system is catching these URL-less spam comments, you may very well be the next thing in spam catching techniques.

  31. There is another reason for it. Have you checked their profiles? Do they list a website on it? Maybe it’s to keep their comments AND have a link to their URL in their profile, which still allows them to exploit Google without being deleted.

  32. Drew,

    Thanks for the kind words. Obviously, we think we’re the next best thing in spam fighting. But I might be biased a tidy bit ;-)

  33. Don’t we all hate spammers! urk!

  34. This is bizarre. I have never seen this on any blog. Still hard to believe that not having a link will help you though… Interesting perspective!

  35. My solution to this is simply to only let comments of any significant value through. Positive comments, as well as negative ones, never make it to the page. I’d rather visitors get a high content without the noise.

  36. Someone asked if the address in the name counts toward google and it does. I search through robotic lawn mower blogs and post, answering any questions that have been commented and unanswered and then leave the url back to our site if anyone else has questions. I always use my name in the name field and google now shows my name as the 3rd most common link back to our website. So yes, google and yahoo both count that as a link. I came here because I saw this up on Digg and was curious enough to click on it.

  37. How can this method be used to gain link juice to a spammy website? Pretty much every blog system on the internet uses the rel=”nofollow” tag in links. Google passes no link juice/trust through links with no follow.

  38. Coldfire:

    Not all blogs do. Plus, it’s rumored that Google still follows them.

    If spamming blogs didn’t work, they wouldn’t do it.

  39. I have yet to see any of this kind of spam on my blog.
    However my lack of good content could be to blame for this.

  40. Another reason they might be doing this is to see if the comment gets cleared up (ie, are the comments actively policed).

    If not they might come back a bit later and post a bunch of links to pr0n, v1agra and p0ker sites…

  41. Thank you for the great article — a true resource, and one that many Diggers clearly enjoy. ;-)

    (Just testing your filter…)

  42. Honestly, though — this is nothing a simple captcha wouldn’t fix. I know that you said that spammers know how to get past them, but who doesn’t?

    Though, simply knowing that you need a human to get past them doesn’t make them any less effective. I used to get tons of spam on my blogs running Movable Type — I started using captchas about two years ago, and I haven’t got _one piece of spam_ since.

  43. Don’t feel bad if you’re not getting spam, like your blog isn’t good or popular enough to attract spambots. LOL

    I advise against CAPTCHAs, except in cases of spambot storms where they’re swarming you and make it time-consuming to moderate and delete them all.

    Comment moderation with delayed posting is fine. We can wait a few hours or a day to see our precious user generated content appended to a blog post!

  44. And please…

    Change your colors here. The small white type on black is nearly impossible to read. Am using Firefox, normal resolution.

    Reverse type on black is good only for large headlines, not body copy.

    :^)

  45. Yeah, CAPTCHAs can reduce or eliminate spam comments, but in many cases I think you also do the same to legit comment posters.

    I have seen CAPTCHAS on Blogger, TypePad, and other platforms that were so skewed, I had to try 3 or 4 times, or more, to see and type in the correct characters.

    Very few blog visitors will tolerate such difficulty.

    Comment moderation with delayed posting, I insist, is the way to go, except during spambot storms.

  46. the colours are fine vaspers. you’re just not straight in the head.
    I’ve been wondering for awhile about these posts too. weird stuff. I think it’s just mind boggling.

    oh and btw..

    Thank you for the great website - a true resource, and one many people clearly enjoy.

    Cheers!

  47. Interesting read, I generally delete comments that do not make sense. Good post though - I will definitely keep some of the stuff in mind when dealing with comments.

  48. Is that what is happening when an email comes through that has absolute gibberish, then? Are they trying to train the spam filters? How is that possible?

  49. I never did think that the spammers were clever enough to post stuff just to get enough comments approved through the spam filters so that they can then cut loose on other blogs. I guess this will be a battle that will last a long time. I know on my blog I get about 1000 spam comments that get busted compared to one that gets approved.

  50. I don’t get it; if you’re doing whitelisting at some point, are they allowed to post URLs in the comment field? I don’t see any benefit here if you systematically forbid URLs in the comment field. Am I missing something here?

  1. 10 Trackback(s)

  2. Sep 26, 2007: rewardless.com » If only spammers used that energy for good. . .
  3. Sep 26, 2007: Cartoons Plugin » Blog Archive » a-ko fan art teen titans vs ranma Are You Seeing this New Kind of Blog Spam?
  4. Sep 26, 2007: General blog info « Tricycle Editors’ Blog
  5. Sep 26, 2007: Blog Spam at Lenwood
  6. Sep 26, 2007: SearchRoads » robot hell lyrics futurama Are You Seeing this New Kind of Blog Spam?
  7. Sep 27, 2007: A lab of experiments and design by Jerlyn Thomas
  8. Sep 28, 2007: UngsungBlog: A (Fantasy) Sports Blog » The Office Season Premiere (Warning: spoilers!), Week 3 (Fantasy) Football Thoughts, And URL-Less Spam
  9. Sep 29, 2007: TerrieMiller.com » Blog Archive » links for 2007-09-29
  10. Sep 29, 2007: Having Fun With More Advanced Blog Comment Spam - Cape Cod SEO
  11. Oct 10, 2007: Teh Blarg » Beta Testing Defensio
