Quick Howto: Extract URLs from RTF files
Malicious RTF (Rich Text Format) documents are back in the news with the exploitation of CVE-2026-21509 by APT28.
The malicious documents BULLETEN_H.doc and Consultation_Topics_Ukraine(Final).doc mentioned in the news are RTF files (despite their .doc extension, a common trick used by threat actors).
Here is a quick tip to extract URLs from RTF files. Use the following command:
rtfdump.py -j -C SAMPLE.vir | strings.py --jsoninput | re-search.py -n url -u -F officeurls
Like this:

BTW, if you are curious, this is what that document looks like when opened:

Let me break down the command:
- rtfdump.py -j -C SAMPLE.vir: this parses the RTF file SAMPLE.vir and produces JSON output with the content of all the items found in the RTF document. Option -C ensures that all combinations are included in the JSON data: the item itself, the hex-decoded item (-H) and the hex-decoded and shifted item (-H -S). So for each item found inside the RTF file, 3 entries are produced in the JSON data.
- strings.py --jsoninput: this takes the JSON data produced by rtfdump.py and extracts all strings
- re-search.py -n url -u -F officeurls: this extracts all URLs (-n url) found in the strings produced by strings.py, performs a deduplication (-u) and filters out all URLs linked to Office document definitions (-F officeurls)
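To make the three stages more concrete, here is a minimal Python sketch of the same idea: produce the three variants of an item, extract printable strings, then pull out deduplicated URLs. It is not how rtfdump.py, strings.py or re-search.py are implemented. The function names, the simplified URL pattern, the reading of "shifted" as dropping one hex digit, and the two ignored hosts are all assumptions made for illustration.

```python
import re

# Assumption: a simplified URL pattern, not the one used by re-search.py.
URL_RE = re.compile(rb"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+")

# Assumption: illustrative hosts only, not the actual officeurls filter list.
IGNORED_HOSTS = ("schemas.openxmlformats.org", "schemas.microsoft.com")


def hex_decode(data: bytes, shift: bool = False) -> bytes:
    """Decode data that consists of hex digits and whitespace only.

    With shift=True the first hex digit is dropped (a one-nibble shift),
    which helps when the digits are misaligned by one position.
    Raises ValueError if anything other than hex digits remains.
    """
    digits = re.sub(rb"\s+", b"", data)
    if shift:
        digits = digits[1:]
    if len(digits) % 2:
        digits = digits[:-1]  # drop a trailing unpaired digit
    return bytes.fromhex(digits.decode("ascii"))


def variants(item: bytes):
    """Yield the item itself, hex-decoded, and hex-decoded with shift."""
    yield item
    for shift in (False, True):
        try:
            yield hex_decode(item, shift)
        except ValueError:
            pass  # not valid hex: skip this variant


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII of at least min_len bytes."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def extract_urls(items) -> list:
    """Return unique, non-ignored URLs in order of first appearance."""
    seen = {}
    for item in items:
        for blob in variants(item):
            for s in extract_strings(blob):
                for match in URL_RE.findall(s):
                    url = match.decode("ascii")
                    if any(host in url for host in IGNORED_HOSTS):
                        continue
                    seen.setdefault(url, None)
    return list(seen)
```

Calling extract_urls() on a list of raw item contents returns each URL once, even when it shows up in more than one variant of the same item.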
So I have found one domain (wellnesscaremed) and one private IP address (192.168...). What I then like to do is search for these keywords in the list of strings, like this:

I found extra IOCs: a UNC and a "malformed" URL. The URL has its hostname followed by @ssl. This is not according to standards. @ can be used to introduce credentials, but then it has to come in front of the hostname, not behind it. So that's not the case here. More on this later.
Here are the results for the other document:


Notice that this time, we have @80.
I believe that this @ notation is used by Microsoft to provide the port number when WebDAV requests are made (via UNC). If you know more about this, please post a comment.
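If that reading is right, splitting such IOCs into their parts is straightforward. The sketch below is only an illustration under that assumption: it treats every @ suffix after the hostname in a UNC path as either "ssl" or a decimal port number. The function name parse_unc_host and the example hostnames are hypothetical.

```python
def parse_unc_host(unc: str) -> dict:
    r"""Split a UNC path such as \\host@ssl\share into host, ssl flag and port.

    Assumption: every '@' suffix after the host is either 'ssl' (any case)
    or a decimal port number; anything else is collected in 'unknown'.
    """
    if not unc.startswith("\\\\"):
        raise ValueError("not a UNC path")
    # The host part is everything between the leading \\ and the next \
    host_part = unc[2:].split("\\", 1)[0]
    host, *suffixes = host_part.split("@")
    ssl, port, unknown = False, None, []
    for suffix in suffixes:
        if suffix.lower() == "ssl":
            ssl = True
        elif suffix.isdecimal():
            port = int(suffix)
        else:
            unknown.append(suffix)
    return {"host": host, "ssl": ssl, "port": port, "unknown": unknown}


if __name__ == "__main__":
    print(parse_unc_host(r"\\files.example.com@ssl\share\doc.lnk"))
    print(parse_unc_host(r"\\files.example.com@80\share\doc.lnk"))
```

Anything that ends up in the unknown list is a suffix this interpretation does not explain and deserves a closer look.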
In an upcoming diary, I will show how to extract URLs from ZIP files embedded in the objects in these RTF files.
Didier Stevens
Senior handler
blog.DidierStevens.com