Two months ago, ISC handler Maarten Van Horenbeeck wrote a great diary on how to extract exploit content from malicious PDF files. Since we are seeing a steady number of these PDFs and PDF-borne exploit attempts, here's a refresher on how to untangle them. Start by reading Maarten's diary again.
To untangle the %u-encoded blocks in the extracted JavaScript, you can use a simple Perl one-liner:
cat nasty.js | perl -pe 's/\%u(..)(..)/chr(hex($2)).chr(hex($1))/ge' | hexdump -C | more
This converts the Unicode escapes (%u...) into the raw bytes they represent. Since most of the Unicode block is machine code (shellcode), the result won't be pretty, which is why we pipe it into hexdump.
But wait, we are changing %u (hex) to raw bytes and then back to a hexdump? Yes. The reason is that each %uxxyy escape holds a 16-bit value whose low byte comes first in memory, so the byte order has to be swapped (yy xx) to get readable text. And "hexdump -C" also prints ASCII where printable. Thusly:
00000320 b5 64 04 64 b5 cb ec 32 89 64 e3 a4 64 b5 f3 ec |µd.dµËì2.dã¤dµóì|
And lo and behold, we have the name of the next stage EXE that this particular exploit is trying to download.
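For readers who prefer Python, the byte swap described above can be sketched as follows. This is only an illustration and not part of the original workflow: the function name decode_percent_u and the sample string are made up for this example. Unlike the Perl one-liner, which rewrites the escapes in place and passes the surrounding text through, this sketch collects only the decoded bytes and accepts only hex digits after %u.

```python
import re

# One %uXXYY escape: XX is the high byte, YY is the low byte.
_ESCAPE = re.compile(r"%u([0-9a-fA-F]{2})([0-9a-fA-F]{2})")


def decode_percent_u(text: str) -> bytes:
    """Return the bytes encoded by all %uXXYY escapes in text.

    Each escape is emitted low byte first (YY, then XX), which is the
    same swap the Perl one-liner performs with chr(hex($2)).chr(hex($1)).
    Text outside the escapes is ignored (an assumption of this sketch).
    """
    out = bytearray()
    for high, low in _ESCAPE.findall(text):
        out.append(int(low, 16))
        out.append(int(high, 16))
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical sample: four escapes that hide the start of a URL.
    sample = "%u7468%u7074%u2f3a%u652f"
    print(decode_percent_u(sample))  # b'http://e'
```

Reading the first escape by hand shows why the swap matters: %u7468 is 0x74 ('t') followed by 0x68 ('h'), and only the swapped order 68 74 yields "ht".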
Things are not always this easy though - sometimes, the URL of the next stage is encoded. Time permitting, I'll add an example on how to crack one of those later today.
Sep 3rd 2008