Bots Searching for Keys & Config Files

Published: 2017-07-19
Last Updated: 2017-07-19 06:26:44 UTC
by Xavier Mertens (Version: 1)
4 comment(s)

If you don’t know our "404" project[1], I would definitively recommend having a look at it! The idea is to track HTTP 404 errors returned by your web servers. I like to compare the value of 404 errors found in web sites log files to “dropped” events in firewall logs. They can have a huge value to detect ongoing attacks or attackers performing some reconnaissance. Reviewing 404 errors is one task from my daily hunting-todo-list but it may quickly become unmanageable if you have a lot of websites or popular ones. The idea is to focus on "rare" events that could usually pass below the radar. Here is a Splunk query that I'm using in a daily report:

index=web sourcetype=access_combined status=404
| rex field=uri "(?<new_uri>^\/{1}[a-zA-Z0-9_\-\~]+\.\w+$)"
| cluster showcount=true t=0.6 field=new_uri
| table _time, cluster_count, cluster_label, new_uri | sort cluster_count

What does it do?

  • It searches for 404 errors in all the indexed Apache logs (access_combined)
  • It extracts interesting URI's. I’m only interested in files from the root directory eg. “GET /<name><dot><extension>”
  • It creates “clusters” of common events to help in detecting rare ones.

Here is an example of output (top-20):

"_time","cluster_count","cluster_label","new_uri"
"2017-07-18T13:42:15.000+0200",1,9,"/xml.log"
"2017-07-18T13:18:51.000+0200",1,11,"/rules.abe"
"2017-07-18T11:51:57.000+0200",1,17,"/tmp2017.do"
"2017-07-18T11:51:56.000+0200",1,18,"/tmp2017.action"
"2017-07-18T09:16:52.000+0200",1,23,"/db_z.php"
"2017-07-18T07:28:29.000+0200",1,25,"/readme.txt"
"2017-07-18T03:44:07.000+0200",1,27,"/sloth_webmaster.php"
"2017-07-18T02:52:33.000+0200",1,28,"/sitemap.xml"
"2017-07-18T00:10:57.000+0200",1,29,"/license.php"
"2017-07-18T00:00:32.000+0200",1,30,"/How_I_Met_Your_Pointer.pdf"
"2017-07-17T22:57:41.000+0200",1,31,"/browserconfig.xml"
"2017-07-17T20:02:01.000+0200",1,76,"/rootshellbe.zip"
"2017-07-17T20:01:00.000+0200",1,82,"/htdocs.zip"
"2017-07-17T20:00:54.000+0200",1,83,"/a.zip"
"2017-07-17T20:00:51.000+0200",1,84,"/wwwroot1.zip"
"2017-07-17T20:00:50.000+0200",1,85,"/wwwroot1.rar"
"2017-07-17T19:59:34.000+0200",1,98,"/rootshell.zip"
"2017-07-17T19:59:27.000+0200",1,103,"/blogrootshellbe.rar"
"2017-07-17T19:59:18.000+0200",1,104,"/rootshellbe.rar"

Many tested files are basically backup files like I already mentioned in a previous diary[2], nothing changed. But yesterday, I found a bot searching for even more interesting files: configuration files from popular tools and website private keys. Indeed, file transfer tools are used by many webmasters to deploy files on web servers and they could theoretically leave juicy data amongst the HTML files. Here is a short list of what I detected:

/filezilla.xml
/ws_ftp.ini
/winscp.ini
/backup.sql
/<sitename>.key
/key.pem
/myserver.key
/privatekey.key
/server.key
/journal.mdb
/ftp.txt
/rules.abe

Each file was searched with a different combination of lower/upper case characters. Note the presence of ‘rules.abe’ that is used by webmasters to specify specific rules for some web applications[3]. This file could contain references to hidden applications (This is interesting to know for an attacker).

So, keep an eye on your 404 errors and happy hunting!

[1] https://isc.sans.edu/404project/
[2] https://isc.sans.edu/forums/diary/Backup+Files+Are+Good+but+Can+Be+Evil/21935
[3] https://noscript.net/abe/web-authors.html

Xavier Mertens (@xme)
ISC Handler - Freelance Security Consultant
PGP Key

4 comment(s)

Comments

Do you ever put "fake" content onto your web-server, e.g., a '/wordpress/' directory, and the '/wordpress/readme.txt' file, and fill that file with a million repeated strings, e.g., 'There is nothing to see here. These are not the droids that you are seeking.' ?

Posting a million obscene words might really annoy the hacker, to the point where they would respond with an all-out attack. Sigh.

But, the point is for them to *NOT* see a '404', and then to respond with further, but obviously fruitless, probing. This wastes their time, and filling their disk-drive with "noise".
Hi,
No because most of my collected logs are coming from official websites and I don't want to pollute them with fake content.
Note that if you do this, their disk won't be always filled because some bots use HEAD requests instead of GET.
Your comment reminds me of Tom Liston's Labrea. http://labrea.sourceforge.net/
Good times, good times... :-)
Hah, I see tons of this on my personal webserver still. Countless probes for Wordpress, phpMyAdmin, and other tools. Sometimes I get a 404 for a page and my response is "what the hell are they looking for?" so I have to Google the filename.

It's like spam...lots of output for a little result. But ultimately at no cost to the perpetrators.

Diary Archives