Last Updated: 2022-10-22 20:30:51 UTC
by Didier Stevens (Version: 1)
Due to the nature of the RTF language, malicious RTF files can be heavily obfuscated.
To analyze a heavily obfuscated RTF maldoc like this one (1c8cfccd2e45ea898125a62686ee97a1e923dfbbc8652889027d46b04aa5dc75), one needs to use rtfdump.py's various options to select the most suspicious items and try to decode them.
To try to automate part of this manual process, I implemented option -F:
With option -F, rtfdump searches through all items that contain hexadecimal strings, tries to decode them (combining options -H and -S), and looks for OLE files (files that start with D0CF11E0).
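To make the idea concrete, here is a minimal Python sketch of that kind of search. It is not rtfdump's code: the function name find_magic is made up for illustration, and I am assuming that the shifted attempt simply drops the first hexadecimal digit and that an odd number of digits is padded with a trailing 0. It takes the hexadecimal text of one item, decodes it both ways, and reports where the magic bytes show up.

```python
import binascii
import re

# Magic bytes at the start of an OLE file, as mentioned in the text.
OLE_MAGIC = bytes.fromhex("D0CF11E0")


def find_magic(hex_item: bytes, magic: bytes = OLE_MAGIC):
    """Return a list of (shift, position) tuples where magic was found.

    shift is 0 for a plain hexadecimal decode and 1 when the first digit
    was dropped before decoding (an assumption about how a one-nibble
    shift could work, not rtfdump's actual implementation).
    """
    # Obfuscated RTF often spreads hex digits over whitespace and newlines.
    digits = re.sub(rb"\s+", b"", hex_item)
    results = []
    for shift in (0, 1):
        candidate = digits[shift:]
        if len(candidate) % 2:
            candidate += b"0"  # pad an odd number of digits (assumption)
        try:
            decoded = binascii.unhexlify(candidate)
        except binascii.Error:
            continue  # not a pure hexadecimal string, skip this attempt
        position = decoded.find(magic)
        if position != -1:
            results.append((shift, position))
    return results


if __name__ == "__main__":
    # The OLE header is found at offset 2 without shifting.
    print(find_magic(b"0105 d0cf11e0\r\na1b11ae1"))
    # Here one extra leading digit hides the header until we shift.
    print(find_magic(b"5d0cf11e0a1b11ae1"))
```

Passing another value for magic (b"MZ" for example) turns the same loop into a search for a different file type.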
You can direct rtfdump to search for other types of files by using option --findcutexpression:
But here, with option -F, one OLE file was found. Let's pipe it into oledump.py:
It contains one stream. Let's use option --storages to view the storages, and option -E "%CLSID% %CLSIDDESC%" to view the class ids:
It's an equation stream. Let's take a look at the content of the stream:
01 is a line record (IIRC) and 08 is a font record. That's where the exploit starts:
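For readers who want to pick the font record apart by hand, here is a small Python sketch. The layout it assumes (tag byte 08, one typeface byte, one style byte, then a null-terminated name) comes from my understanding of the MTEF format description, not from this analysis, and parse_font_record and FontRecord are hypothetical names. You have to supply the offset of the 08 byte yourself, for example from a hex dump of the stream.

```python
from typing import NamedTuple

FONT_RECORD_TAG = 0x08  # font record tag, as identified in the text


class FontRecord(NamedTuple):
    typeface: int
    style: int
    name: bytes  # raw name bytes, without the terminating null
    end: int     # offset just past the terminating null (or end of data)


def parse_font_record(data: bytes, offset: int) -> FontRecord:
    """Parse a font record that starts at offset.

    Assumed layout: tag (08), typeface byte, style byte, then a
    null-terminated name.
    """
    if len(data) < offset + 3:
        raise ValueError("not enough data for a font record")
    if data[offset] != FONT_RECORD_TAG:
        raise ValueError("no font record tag at offset %d" % offset)
    name_start = offset + 3
    name_end = data.find(b"\x00", name_start)
    if name_end == -1:
        # No terminator: report everything up to the end of the data.
        return FontRecord(data[offset + 1], data[offset + 2],
                          data[name_start:], len(data))
    return FontRecord(data[offset + 1], data[offset + 2],
                      data[name_start:name_end], name_end + 1)


if __name__ == "__main__":
    # Made-up bytes for illustration, not taken from the sample.
    sample = bytes.fromhex("0a01085a5a41424300ff")
    print(parse_font_record(sample, 2))
```

The name field is the part to look at closely, since that is where the exploit starts in this sample.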
Let's extract the complete content of this stream, write it to disk, and have scdbg analyze this 32-bit shellcode: