Can blog spam be of any real use to security teams? Here’s my take on turning a piece of what some consider internet background noise into information ripe for becoming actionable intelligence.
I get waves of blog spam – comments posted to a blog site advertising someone else’s wares (including links to malware!) or services, or attempting to increase search engine rankings – on my small corner of the internet at infrequent intervals. To many of my fellow blog owners this is a source of constant annoyance, but I get a little, gleeful smile and promptly dump the user agent, body text (extracting any embedded URLs), and posting IP address into my pile of “all things to observe and search on”.
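As a rough illustration of that capture step, here is a minimal Python sketch. The `SpamRecord` class, the `make_record` function and the URL pattern are all hypothetical names made up for this example, not the code I actually run, and the regular expression is deliberately loose rather than a full URL parser.

```python
import re
from dataclasses import dataclass, field

# Loose pattern (an assumption for this sketch): a URL runs until
# whitespace, a quote or an angle bracket.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


@dataclass
class SpamRecord:
    """One spam comment reduced to the fields worth searching on."""
    ip: str
    user_agent: str
    body: str
    urls: list = field(default_factory=list)


def make_record(ip, user_agent, body):
    """Build a record and pull any embedded URLs out of the body text."""
    # Trim trailing punctuation that tends to stick to URLs in prose.
    urls = [u.rstrip(".,;:)") for u in URL_RE.findall(body)]
    return SpamRecord(ip=ip, user_agent=user_agent.strip(), body=body, urls=urls)
```

Calling `make_record` once per spam comment gives a list of records that can be written to whatever store you prefer.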
Once it’s carefully added to my speed optimized database*, I sort it, note the duplicate posts and do some passive look-ups on free resources, such as the Internet Storm Center (ISC) IP lookup tables, to see if it’s known or reported as malicious. I then pipe the IP addresses, domains and URLs into a local copy of the Collective Intelligence Framework (CIF), regardless of whether the passive searching yielded any information, to see if anyone else has run into them. For those unfamiliar with CIF, fellow Handler Russ McRee did a nice write-up on the basics of the Collective Intelligence Framework by Wes Young. CIF pools data from numerous sources and can quickly help identify whether any of the collected data points to botnets, infected systems, malware hosts, etc. All of which is a huge informational leap up from an annoying automated posting with an IP address and URL.
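The sorting and de-duplication part of that step can be sketched in a few lines of Python. This is only an illustration under assumptions: `collect_indicators` is a hypothetical helper, the input is assumed to be `(ip, body, urls)` tuples, and the actual ISC and CIF look-ups are left out because they depend on your own setup.

```python
from collections import Counter
from urllib.parse import urlsplit


def collect_indicators(posts):
    """posts: iterable of (ip, body_text, list_of_urls) tuples."""
    posts = list(posts)

    # Bodies that appear more than once, compared case-insensitively.
    body_counts = Counter(body.strip().lower() for _, body, _ in posts)
    duplicates = {body: n for body, n in body_counts.items() if n > 1}

    # Unique indicators, sorted as plain strings for stable output.
    ips = sorted({ip for ip, _, _ in posts})
    urls = sorted({u for _, _, found in posts for u in found})
    hosts = {urlsplit(u).hostname for u in urls}
    domains = sorted(h for h in hosts if h)

    return {"duplicates": duplicates, "ips": ips, "urls": urls, "domains": domains}
```

The returned lists of IP addresses, domains and URLs are the values you would then hand to your look-up tooling of choice.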
With those searches completed, I can then compare the results against historical data or logs from other sources (firewalls, proxy logs or spam filters). All of this is automated via some ‘internet researched’ code - poorly shunted together by yours truly. Once any matches and final results are spat out, I can decide whether to add the IP address, net block, user agent or URL to a block or monitor list. I’m not a fan of trusting my scripts or intelligence feeds to be completely accurate for automatically blocking IP ranges, but I worry less about pushing alerted URLs into the Suspicious category on the web proxy system. I’ve found my human web surfing anomaly detection systems are really good at ringing up and moaning if we, that’s the Royal We (meaning me), accidentally block Google.
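To make the compare-then-decide idea concrete, here is a small Python sketch. The `triage` function, its three labels and the rule behind them are assumptions invented for this example; the point is only that the script suggests and a human decides, in keeping with not trusting automation to block on its own.

```python
def triage(spam_ips, seen_in_logs, flagged_by_feeds):
    """Suggest an action per IP address; nothing is blocked automatically.

    spam_ips:         IP addresses taken from the blog spam
    seen_in_logs:     set of IPs also found in firewall/proxy/spam filter logs
    flagged_by_feeds: set of IPs reported by an intelligence source
    """
    suggestions = {}
    for ip in spam_ips:
        in_logs = ip in seen_in_logs
        flagged = ip in flagged_by_feeds
        if in_logs and flagged:
            # Two independent signals: worth a human look before blocking.
            suggestions[ip] = "review for block"
        elif in_logs or flagged:
            suggestions[ip] = "monitor"
        else:
            suggestions[ip] = "no action"
    return suggestions
```

Keeping the output as suggestions rather than firewall rules means a bad feed entry costs a few minutes of review instead of an outage.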
If you want to go to town visually with the data, pop the resolved spamming IP addresses into a geo-IP lookup (the ISC has a page to help with that) and show friends, family or the management where the bad IP addresses live. Who says the whole family can’t enjoy an evening of PowerPoint together, listing the towns, cities and countries that spam your blog sites? Surely that beats watching re-runs of some random TV show?
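For the slide deck, a per-country tally is probably all you need. The Python sketch below assumes you already have some way of turning an IP address into a location; `lookup_country` is a stand-in you supply yourself (a geo-IP database, a saved export, whatever you have), and `tally_by_country` is a hypothetical helper name.

```python
from collections import Counter


def tally_by_country(ips, lookup_country):
    """Count unique spamming IP addresses per country, busiest first.

    lookup_country is any callable mapping an IP string to a country
    label, returning None when the address cannot be placed.
    """
    counts = Counter(lookup_country(ip) or "unknown" for ip in set(ips))
    return counts.most_common()
```

Passing a plain dictionary's `get` method as `lookup_country` is enough to try it out against a handful of addresses.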
All this possible intelligence comes from humble blog spam, so what could you do with that data?
As ever, feel free to pitch in any thoughts or comments.