How to best start the new year? How about a new tool: what-is-new.py.

It's something I have to do often, and I'm sure you do too: you make lists at regular intervals (for example every week), and you want to know what is new, i.e. what you haven't seen before. This is what my tool what-is-new.py helps you with: you give it text files, and it reports every line it hasn't seen before (it keeps a database).

For example, I use this tool to review the User Agent Strings of the HTTP(S) requests to my web servers. Every week I produce a list of User Agent Strings found in my web server logs and feed it to what-is-new: this gives me a list of User Agent Strings not seen before.

Detail: User Agent Strings contain version numbers, and that makes for a long list of "new" User Agent Strings every week. I solve this problem by using a custom, canonical representation of the User Agent String: I only keep the letters. For example, User Agent String "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Safari/534.30 CyanogenMod/10.2/grouper" becomes "Mozilla X Linux x AppleWebKit KHTML like Gecko Version Safari CyanogenMod grouper". With this representation, I get about 50 new User Agent Strings every week.

Here are some interesting ones found in the last few months:

Nikto:
Canonical:
Actual:

And apparently, someone visited my site from a Cray supercomputer :-)
"Mozilla/0.3 (Cray UNICOS) Lynx/2.0.113.0"

Some visitors cherish their privacy explicitly:
"Mozilla/5.0 (have a guess) recent but undisclosed"

And finally, since cryptocurrencies have become so popular:
"whoismining.com Bot/1.0"
This is from a web site that checks if web sites use your browser to mine cryptocurrencies:

Best wishes from the Internet Storm Center!
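For readers who want to experiment with the idea, here is a minimal Python sketch of the canonical representation and the "report what is new" step described in this diary entry. It is not the code of what-is-new.py: the function names canonicalize and report_new are hypothetical, "letters" is assumed to mean ASCII letters, and an in-memory set stands in for the tool's database.

```python
import re


def canonicalize(user_agent):
    """Keep only runs of ASCII letters, joined by single spaces (assumed rule)."""
    return " ".join(re.findall(r"[A-Za-z]+", user_agent))


def report_new(lines, seen):
    """Return the lines whose canonical form is not yet in `seen`.

    `seen` is a plain set standing in for the tool's database; it is
    updated in place so that later calls remember earlier lines.
    """
    new_lines = []
    for line in lines:
        key = canonicalize(line)
        if key not in seen:
            seen.add(key)
            new_lines.append(line)
    return new_lines


seen = set()
week1 = ["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) "
         "Version/4.0 Safari/534.30 CyanogenMod/10.2/grouper"]
week2 = ["Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.31 (KHTML, like Gecko) "
         "Version/4.1 Safari/534.31 CyanogenMod/10.3/grouper",
         "whoismining.com Bot/1.0"]

print(report_new(week1, seen))  # first run: every line is new
print(report_new(week2, seen))  # only the whoismining.com line is reported
```

Because only the letters are compared, the second week's browser string, which differs from the first only in its version numbers, is not reported again.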
Didier Stevens
ISC Handler, Jan 1st 2018