Getting the EXE out of the RTF again

Published: 2010-03-26
Last Updated: 2010-03-26 14:19:15 UTC
by Daniel Wesemann (Version: 1)
1 comment(s)

Since we got some mails from readers who had trouble getting the malware extraction technique described in http://isc.sans.org/diary.html?storyid=6703 to work on yesterday's malicious "copyright lawsuit" sample, here's another quick walk-through of how to carve an EXE out of a DOC or RTF file.

$ file suit_documents.doc
suit_documents.doc: Rich Text Format data, version 1, ANSI
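
`file` makes this call from the first few bytes of the document. As a rough sketch, the same check can be done by hand by comparing the first five bytes against the RTF signature. The helper name `is_rtf` is made up for illustration, and `head -c` is assumed to be available (it is on GNU and BSD systems):

```shell
# Succeeds (exit status 0) when the file begins with the RTF signature "{\rtf".
# is_rtf is a made-up helper name, not a standard tool.
is_rtf() {
    [ "$(head -c 5 "$1")" = '{\rtf' ]
}
```

Used as `is_rtf suit_documents.doc && echo "RTF in disguise"`, it only prints the message when the signature matches, whatever the file extension claims.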

Hmm, looks like this DOC is an RTF ... Let's see what it contains:

$ head suit_documents.doc
{\rtf1\ansi\ansicpg1252\deff0{\fonttbl{\f0\fswiss\fcharset0 Arial;}}
{\*\generator Msftedit 5.41.15.1515;}\viewkind4\uc1\pard\lang1033\f0\fs20{\object\objemb{\*\objclass Package}\objw795\objh765{\*\objdata
01050000
02000000
08000000
5061636b61676500
00000000
00000000
6f740000
0200646f63732e70646600433a5c446f63756d656e747320616e642053657474696e67735c4164

OK .. it does indeed look like an RTF with an embedded object. The pile of numbers is the embedded object's bytes written out as pairs of hex digits, but before we can convert them back to raw bytes, we first have to strip away the initial two lines, because their presence would confuse the Perl statement that follows later.

$ cat suit_documents.doc | sed '1,2d' > suit1.temp
$ head suit1.temp
01050000
02000000
08000000
5061636b61676500
00000000
00000000
6f740000
0200646f63732e70646600433a5c446f63756d656e747320616e642053657474696e67735c4164
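
Hard-coding the number of lines to delete only works when the hex blob happens to start on line 3. A sketch that keys on the "objdata" marker instead is shown here. It assumes, as in this sample, that the marker is not on the very first line, that it ends its line, and that the hex data begins on the next one. The function name `strip_rtf_header` is made up for illustration:

```shell
# Delete everything from line 1 through the first line containing "objdata",
# leaving only the lines of hex digits that follow it.
strip_rtf_header() {
    sed '1,/objdata/d' "$1"
}
```

With that in place, `strip_rtf_header suit_documents.doc > suit1.temp` should give the same result as the `sed '1,2d'` command for this sample.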

Now, we are ready for the transformation from Hex ASCII codes to printable characters:

$ cat suit1.temp | perl -ne 's/(..)/print(chr(hex($1)))/ge' > suit2.temp

So far, the old method still seems to work: we locate "objdata" in the RTF document, strip out everything in front of the hex data, then feed the blob into Perl to convert the hexadecimal codes to the actual bytes. I changed the Perl command slightly compared to the earlier diary on the subject, because one of the problems that people seem to have had is related to how "end of line" is treated on Windows vs. Unix. The earlier version

$ cat detail.rtf | sed -e '1,3d' | perl -ne 's/(..)/print chr(hex($1))/ge' > detail.bin

kept any DOS line terminators unchanged, which doesn't bode well for the resulting executable. The new version

$ cat suit1.temp | perl -ne 's/(..)/print(chr(hex($1)))/ge' > suit2.temp

is now really only printing out converted hex codes, and is dropping all the CR/LF line terminators that are present in the original file after every line. The resulting file is still in "Object Package" format, but if you look closely, you can see the tell-tale "MZ" that marks the start of an executable:

What makes this case a bit more convoluted than last year's example is that the bad guys tried real hard to disguise the contents. This time, the initial file had a .DOC extension but was in fact an RTF file, which contained an embedded COMPLA~1.EXE that had a harmless-looking icon (3.ico) and was displayed to the user as "docs.pdf". Yup, pretty sneaky. You can see all these file names in the hex output above.

Now, how to get the EXE out. According to the earlier diary mentioned above, the numbers between the EXE filename and the "MZ" header mark the size of the executable that we need to cut out. In this case, we have "00 10 74 00 00" in that position:

00000070 4c 41 7e 31 2e 45 58 45 00 10 74 00 00 4d 5a 90 |LA~1.EXE..t..MZ.|

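What my earlier example didn't make clear is that these numbers have to be read "right to left" to determine the size, because the value is stored least significant byte first (little-endian). The leading 00 just terminates the file name; the four bytes that follow, 10 74 00 00, are the size. In the current case, the size is 007410hex, which converts to 29712 bytes.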

Let's carve it out. We need to skip 0x7D (=125) bytes from the beginning of the file to get to the "MZ" marker, and from there, the EXE should be 29712 bytes long.

$ dd if=suit2.temp of=suit2.exe skip=125 count=29712 bs=1
29712+0 records in
29712+0 records out
29712 bytes (30 kB) copied, 0.15203 s, 195 kB/s

$ md5sum suit2.exe
ead062fb0aca0e3d0e8c12c4cf095765 suit2.exe

Voilà! Now, we can use this hash on http://www.virustotal.com/buscaHash.html to see if someone else has analyzed this file before :) 

 

 


Comments

Excellent! Thanks Daniel, I had a suspicion it was those pesky zero's throwing me off.
