Researchers Scanning the Internet
We have been using our data to identify researchers scanning the internet for a few years. Currently, we are tracking 36 groups performing such scans, and our data feed of the IP addresses used contains around 33k addresses [1].
Of course, there is no clear definition of when a scan is inappropriate. Some consider any scan performed without permission to be unethical. Others set a higher bar, for example, considering scans appropriate as long as they do not exploit vulnerabilities or cause damage. Legal frameworks vary around the world.
Earlier today, Caleb reminded me of RFC 9511, which I believe offers some good ideas and should be considered if you plan to perform an internet-wide scan [2]. The RFC is entitled "Attribution of Internet Probes." It gets to one of the main issues: Identify yourself if you are performing these scans. This way, if you are causing problems, targets can contact you. This should be a minimum requirement to limit unintentional damage.
Can a simple "scan" cause damage? Of course, it can! We have seen plenty of examples of such scans causing problems. My favorite example is an old Cisco bug that caused routers to crash if they were scanned with empty UDP packets.
RFC 9511 suggests adding a URL to your probe packets and publishing a probe description file at "/.well-known/probing.txt". The IP address the probes originate from should reverse resolve to a hostname, and the probe description file should be reachable at that hostname. Alternatively, the host the probes originate from should run a web server offering the file, or the probe description URL should be included in the probe's payload.
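As a defender, you can follow that chain in reverse to identify who is probing you. The sketch below (Python, standard library only) takes a probe source IP from your logs, reverse-resolves it, and fetches the probe description file; the IP address shown is just a documentation placeholder, not a real scanner.

import socket
import urllib.request

def fetch_probe_description(ip: str) -> str:
    """Reverse-resolve a probe source and fetch its RFC 9511 probe description file."""
    hostname = socket.gethostbyaddr(ip)[0]  # PTR lookup; RFC 9511 expects one to exist
    url = f"https://{hostname}/.well-known/probing.txt"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

print(fetch_probe_description("192.0.2.1"))  # placeholder IP (TEST-NET-1)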
For web-based scanning, I see many scanners adding a URL to the User-Agent header, which I think fulfills what RFC 9511 is attempting to achieve.
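Adding such a header to your own scans takes one line. A minimal sketch, again using only the standard library; the scanner name, attribution URL, and scan target are placeholders you would replace with your own:

import urllib.request

# Announce who you are and where your probe description lives.
req = urllib.request.Request(
    "http://example.com/",  # placeholder scan target
    headers={"User-Agent": "my-research-scanner/1.0 (+https://scanner.example.org/.well-known/probing.txt)"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    print(resp.status)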
In the past, we have received requests to exclude these scanners from our "Blocklist" (https://isc.sans.edu/block.txt). So far, I have not removed them, but I will consider doing so as long as they are easily identified. There is usually little value in blocking them from your network. If you do want to block them, you may use our API data at your own risk [1]. For the most part, though, I think this feed is more helpful for identifying these scanners in your logs so you can better judge the intent behind a particular log entry. Some of our honeypots block these requests, as they do not necessarily add value to our data collection, and these scans may make it easier to fingerprint our honeypots.
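For log enrichment, something like the following sketch works. Note the assumptions: I am appending a "?json" suffix for JSON output and guessing an "ipv4" field name for each record; check the API documentation for the actual schema before relying on it.

import json
import urllib.request

FEED = "https://isc.sans.edu/api/threatcategory/research?json"  # "?json" suffix assumed

with urllib.request.urlopen(FEED, timeout=30) as resp:
    records = json.load(resp)
# "ipv4" is an assumed field name; adjust to the real schema.
research_ips = {r["ipv4"] for r in records if r.get("ipv4")}

for ip in ["203.0.113.5", "198.51.100.7"]:  # placeholder IPs from your logs
    print(ip, "research scanner" if ip in research_ips else "unknown")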
Most importantly, before starting your own internet-wide scans, consider using existing data. Organizations like Shodan and Censys provide some access to their data and may even share it with researchers.
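For example, with the Shodan Python library you can retrieve what Shodan has already collected for a host instead of probing it yourself. A sketch; the API key and IP are placeholders, and which fields are present varies by host:

import shodan  # pip install shodan

api = shodan.Shodan("YOUR_API_KEY")  # placeholder key
host = api.host("192.0.2.1")         # placeholder IP; returns previously collected banner data
print("open ports:", host.get("ports", []))
for banner in host.get("data", []):
    print(banner.get("port"), banner.get("product", "unknown service"))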
How do we know if a scan is done for non-malicious purposes? We take people's word for it. Unless we can show malicious intent, we believe anyone who claims to scan the internet for research purposes. This includes commercial entities that may offer attack surface monitoring to their clients.
And while working on this diary, I believe I spotted two new organizations that I will add to the feed shortly.
[1] https://isc.sans.edu/api/threatcategory/research
[2] https://datatracker.ietf.org/doc/rfc9511/
---
Johannes B. Ullrich, Ph.D., Dean of Research, SANS.edu