Last Updated: 2015-06-16 14:25:57 UTC
by Johannes Ullrich (Version: 1)
Many web application firewalls do block odd user agents. However, decent vulnerability scanners will try to evade these simple protections by trying to emulate the user agent string of commonly used browsers. To figure out if I can distinguish bad from good, I compared some of the logs from our honeypots to logs from a normal web server (isc.sans.edu). Many of the top user agents hitting the honeypot are hardly seen on normal web sites, allowing me to identify possible vulnerability scanners.
First: There are a number of legitimate scripts that poll our data on isc.sans.edu. While for example "Python" is used by many vulnerability scanners, we do have a good number of python scripts using our APIs. I tried to eliminate some of these requests.
Odd legitimate user agents:
First lets start with a couple of odd user agents from our normal site:
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.247) Gecko/20100101 Firefox/17.247
Yes, the string "User-Agent:" is part of the user agent string. The version of Firefox is also old... (if legit at all. I don't have Firefox 17 around to verify). This user agent string is used by a web site uptime monitoring service. I assume the developer didn't quite understand how to set the user agent, and ended up with the extra "User-Agent:" text.
Mozilla/5.0 (compatible; MJ12bot/v1.4.5; http://www.majestic12.co.uk/bot.php?+)
I don't see any actual attacks from "Majestic", but they are certainly an aggressive bot. As explained on their site, you can download the bot and the goal is to build a distributed network of bot spidering web based content.
The following user agent strings are much more common in our honeypot then in our normal web site, indicating that these user agents are used by vulnerability scanners. However, these are (in some cases) legitimate user agents.
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0
An old version of Firefox. The #1 user agent right now in our honeypot. Firefox/8.0 does not show up in the top 1,000 user agents used on isc.sans.edu.
Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:37.0) Gecko/20100101 Firefox/37.0
#2 in our honeypot. Sure... there may be some people browsing the internet using Firefox 37 (a recent version) on Ubuntu. But certainly not your #2 most common browser. On our real system, this user agent comes in as #220.
#3 in our honeypot is masscan. Of course this is a safe to block vulnerability scanner.
Opera/9.80 (X11; Linux x86_64) Presto/2.12.388 Version/12.16
After some obvious bots (e.g Baidu), we got Opera, a browser that doesn't show up at all in the top 100 user agents used on our ISC website.
So what can you do with this information?
- Some blocking on the web application firewall is probably a good idea for tools like masscan. You may want to allow them if they are used by legitimate pentesters or vulnerability scans that you use to test your web applications.
- If some of these user agents have legit uses, but are more often used maliciously, use them for your log reviews. See what kind of requests you see more likely from odd (usually outdated) user agents . Many tools use a current user agent when they are created, but then the user agent is never updated so they end up with outdated user agent strings that start to "stick out" as most of your users upgrade.
- Decent web application firewalls will look for other artifacts, like header order, to verify the user agent. We also see user agents like Googlebot abused (see a prior diary about identifying fake google bots) .