Bots Searching for Keys & Config Files
If you don’t know our "404" project[1], I definitely recommend having a look at it! The idea is to track the HTTP 404 errors returned by your web servers. I like to compare 404 errors in website log files to “dropped” events in firewall logs: both can be very valuable for detecting ongoing attacks or attackers performing reconnaissance. Reviewing 404 errors is one of the tasks on my daily hunting to-do list, but it can quickly become unmanageable if you have many websites or popular ones. The trick is to focus on "rare" events that would usually pass below the radar. Here is a Splunk query that I use in a daily report:
index=web sourcetype=access_combined status=404 | rex field=uri "(?<new_uri>^\/{1}[a-zA-Z0-9_\-\~]+\.\w+$)" | cluster showcount=true t=0.6 field=new_uri | table _time, cluster_count, cluster_label, new_uri | sort cluster_count
What does it do?
- It searches for 404 errors in all the indexed Apache logs (access_combined)
- It extracts interesting URIs. I’m only interested in files in the root directory, e.g. “GET /<name><dot><extension>”
- It creates “clusters” of common events to help in detecting rare ones.
Here is an example of output (top-20):
"_time","cluster_count","cluster_label","new_uri"
"2017-07-18T13:42:15.000+0200",1,9,"/xml.log"
"2017-07-18T13:18:51.000+0200",1,11,"/rules.abe"
"2017-07-18T11:51:57.000+0200",1,17,"/tmp2017.do"
"2017-07-18T11:51:56.000+0200",1,18,"/tmp2017.action"
"2017-07-18T09:16:52.000+0200",1,23,"/db_z.php"
"2017-07-18T07:28:29.000+0200",1,25,"/readme.txt"
"2017-07-18T03:44:07.000+0200",1,27,"/sloth_webmaster.php"
"2017-07-18T02:52:33.000+0200",1,28,"/sitemap.xml"
"2017-07-18T00:10:57.000+0200",1,29,"/license.php"
"2017-07-18T00:00:32.000+0200",1,30,"/How_I_Met_Your_Pointer.pdf"
"2017-07-17T22:57:41.000+0200",1,31,"/browserconfig.xml"
"2017-07-17T20:02:01.000+0200",1,76,"/rootshellbe.zip"
"2017-07-17T20:01:00.000+0200",1,82,"/htdocs.zip"
"2017-07-17T20:00:54.000+0200",1,83,"/a.zip"
"2017-07-17T20:00:51.000+0200",1,84,"/wwwroot1.zip"
"2017-07-17T20:00:50.000+0200",1,85,"/wwwroot1.rar"
"2017-07-17T19:59:34.000+0200",1,98,"/rootshell.zip"
"2017-07-17T19:59:27.000+0200",1,103,"/blogrootshellbe.rar"
"2017-07-17T19:59:18.000+0200",1,104,"/rootshellbe.rar"
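If you don't have Splunk at hand, the same idea can be approximated with a few lines of Python. This is only a rough sketch, not an equivalent of the query: it assumes raw Apache combined-format log lines, reuses the regular expression from the rex command, and replaces the similarity-based cluster command with a plain exact-match count. The names rare_root_404s and max_count are made up for this example.

```python
import re
from collections import Counter

# Assumed Apache "combined" layout:
# host ident user [time] "METHOD URI PROTO" status size "referer" "agent"
LOG_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<uri>\S+) [^"]*" (?P<status>\d{3}) '
)

# Same pattern as the rex in the Splunk query: one file in the root directory.
ROOT_FILE_RE = re.compile(r'^/[a-zA-Z0-9_\-~]+\.\w+$')


def rare_root_404s(lines, max_count=1):
    """Return (uri, count) pairs for root-level files that triggered a 404
    at most max_count times, rarest first.

    Note: this counts exact URIs; it does not group similar URIs the way
    Splunk's cluster command does.
    """
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m or m.group("status") != "404":
            continue
        uri = m.group("uri")
        if ROOT_FILE_RE.match(uri):
            counts[uri] += 1
    rare = [(uri, n) for uri, n in counts.items() if n <= max_count]
    return sorted(rare, key=lambda pair: (pair[1], pair[0]))
```

It can be fed directly from a log file, e.g. rare_root_404s(open("access.log")), where access.log stands for whatever file your web server writes.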
Many of the tested files are backup files, as I already mentioned in a previous diary[2]; nothing has changed there. But yesterday, I found a bot searching for even more interesting files: configuration files from popular tools and website private keys. Indeed, many webmasters use file transfer tools to deploy files on web servers, and these tools could leave juicy data among the HTML files. Here is a short list of what I detected:
/filezilla.xml
/ws_ftp.ini
/winscp.ini
/backup.sql
/<sitename>.key
/key.pem
/myserver.key
/privatekey.key
/server.key
/journal.mdb
/ftp.txt
/rules.abe
Each file was requested with different combinations of lower- and upper-case characters. Note the presence of ‘rules.abe’, a file that webmasters use to define rules for some web applications[3]. It could contain references to hidden applications, which is interesting for an attacker to know.
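Because the bot varies the case of each name, any watchlist has to be matched case-insensitively. Here is a minimal sketch of such a check, built from the list above. The function name is_sensitive_probe is hypothetical, and treating every .key or .pem file as sensitive (to cover /<sitename>.key) is my own assumption, not something observed in the logs.

```python
# File names taken from the list above, stored in lower case.
SENSITIVE_NAMES = {
    "filezilla.xml", "ws_ftp.ini", "winscp.ini", "backup.sql", "key.pem",
    "myserver.key", "privatekey.key", "server.key", "journal.mdb",
    "ftp.txt", "rules.abe",
}

# Assumption: any private-key-like extension is worth flagging,
# which also covers names such as /<sitename>.key.
SENSITIVE_EXTENSIONS = (".key", ".pem")


def is_sensitive_probe(uri):
    """Return True if the requested URI targets a watched file name,
    whatever mix of upper and lower case the client used."""
    path = uri.split("?", 1)[0]                  # drop any query string
    name = path.rsplit("/", 1)[-1].casefold()    # last path component, case-folded
    return name in SENSITIVE_NAMES or name.endswith(SENSITIVE_EXTENSIONS)
```

For example, is_sensitive_probe("/FileZilla.XML") and is_sensitive_probe("/WinSCP.ini") both return True, while is_sensitive_probe("/index.html") returns False.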
So, keep an eye on your 404 errors and happy hunting!
[1] https://isc.sans.edu/404project/
[2] https://isc.sans.edu/forums/diary/Backup+Files+Are+Good+but+Can+Be+Evil/21935
[3] https://noscript.net/abe/web-authors.html
Xavier Mertens (@xme)
ISC Handler - Freelance Security Consultant
Comments
Posting a million obscene words might really annoy the hacker, to the point where they would respond with an all-out attack. Sigh.
But the point is for them *NOT* to see a '404', and then to respond with further, but obviously fruitless, probing. This wastes their time and fills their disk drive with "noise".
Anonymous
Jul 19th 2017
7 years ago
No because most of my collected logs are coming from official websites and I don't want to pollute them with fake content.
Note that if you do this, their disk won't always be filled, because some bots use HEAD requests instead of GET.
Anonymous
Jul 19th 2017
7 years ago
Good times, good times... :-)
Anonymous
Jul 19th 2017
7 years ago
It's like spam...lots of output for a little result. But ultimately at no cost to the perpetrators.
Anonymous
Jul 20th 2017
7 years ago