Phish or scam? - Part 2

Published: 2017-12-18
Last Updated: 2017-12-18 07:03:28 UTC
by Didier Stevens (Version: 1)

We continue yesterday's analysis of the MSG file.

There are several ways to take a look at the text contained in a Word .docx file without using MS Office.

Here we will look at the raw XML. The content of a Word file is stored in the following file:

As you can see, the text of the document is contained between XML tags. Filtering out these XML tags, for example with a regular expression and sed, reveals the text without any formatting:
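
The same tag-stripping idea can be sketched in Python instead of sed. This is only an illustration, not the command used in this diary entry: it assumes the main document part is stored as word/document.xml inside the ZIP container, and the helper names (read_document_xml, strip_xml_tags, build_sample_docx) are made up for this example. The sample file is built in memory so the snippet runs on its own.

```python
import io
import re
import zipfile

# Assumption: the main document part lives at this path inside the .docx ZIP container.
DOCUMENT_PART = "word/document.xml"


def read_document_xml(docx_source):
    """Return the raw XML of the main document part as a string."""
    with zipfile.ZipFile(docx_source) as archive:
        return archive.read(DOCUMENT_PART).decode("utf-8")


def strip_xml_tags(xml_text):
    """Remove anything that looks like an XML tag and keep the text in between."""
    return re.sub(r"<[^>]+>", "", xml_text)


def build_sample_docx():
    """Build a tiny in-memory ZIP that mimics the layout of a .docx (illustration only)."""
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body>"
        "<w:p><w:r><w:t>Dear friend,</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Please reply soon.</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(DOCUMENT_PART, xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Both paragraphs run together because the tags carried the structure.
    print(strip_xml_tags(read_document_xml(build_sample_docx())))
```

For a real file, pass its path to read_document_xml instead of the in-memory sample. Note that a regular expression like this does not decode XML entities such as &amp;amp; and treats every tag the same way, which is why paragraph boundaries are lost.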

However, the result can be hard to read without any line breaks, and this method will sometimes strip away information you want to see.
That is why I wrote a simple tool in Python that reads XML and can extract various kinds of information:
You can achieve the same result as with sed by using the command text:

Command wordtext is like command text, but it looks for paragraphs (<w:p>) and inserts a newline after extracting the text of each paragraph:
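
The paragraph-aware extraction described above can be approximated with Python's standard library. This is a minimal sketch of the idea, not the code of the tool itself: the function name paragraphs_text and the sample XML are invented for this example, and it assumes the usual WordprocessingML namespace for the w: prefix.

```python
import xml.etree.ElementTree as ET

# Assumption: the w: prefix maps to the standard WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraphs_text(xml_text):
    """Concatenate the w:t text of each w:p element, adding a newline after each paragraph."""
    root = ET.fromstring(xml_text)
    output = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph is usually split into several runs, each with its own w:t element.
        pieces = [node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t")]
        output.append("".join(pieces) + "\n")
    return "".join(output)


SAMPLE_XML = (
    f'<w:document xmlns:w="{W_NS}">'
    "<w:body>"
    "<w:p><w:r><w:t>Dear </w:t></w:r><w:r><w:t>friend,</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Please reply soon.</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

if __name__ == "__main__":
    print(paragraphs_text(SAMPLE_XML), end="")
```

Unlike the tag-stripping approach, the text of runs within one paragraph is joined and each paragraph ends up on its own line. Paragraphs nested inside other paragraphs (for example in text boxes) would be reported more than once by this simple loop.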


From the content of the Word document, it's clear that this is a scam.
To be thorough, I also poked around a bit looking for exploits or feature abuse (like DDE), but found nothing.
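
One way to look for DDE abuse in the document XML is to inspect field instructions. The sketch below is an assumption about how such a check could be done, not the procedure followed for this diary entry: it collects field instructions from w:instrText elements and from the w:instr attribute of w:fldSimple, and reports those that contain the DDE or DDEAUTO keyword. The function name find_dde_fields and the sample XML are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the w: prefix maps to the standard WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Matches the DDE and DDEAUTO field keywords as whole words.
DDE_PATTERN = re.compile(r"\bDDE(AUTO)?\b", re.IGNORECASE)


def find_dde_fields(xml_text):
    """Return the field instructions that mention DDE or DDEAUTO."""
    root = ET.fromstring(xml_text)
    instructions = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # An instruction can be split over several w:instrText elements,
        # so the pieces found in one paragraph are joined before matching.
        joined = "".join(node.text or "" for node in paragraph.iter(f"{{{W_NS}}}instrText"))
        if joined:
            instructions.append(joined)
    for simple_field in root.iter(f"{{{W_NS}}}fldSimple"):
        instruction = simple_field.get(f"{{{W_NS}}}instr")
        if instruction:
            instructions.append(instruction)
    return [text.strip() for text in instructions if DDE_PATTERN.search(text)]


# Hypothetical sample: a field whose keyword is split over two runs.
SAMPLE_WITH_FIELD = (
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p>'
    '<w:r><w:fldChar w:fldCharType="begin"/></w:r>'
    "<w:r><w:instrText> DDE</w:instrText></w:r>"
    '<w:r><w:instrText>AUTO app "topic" </w:instrText></w:r>'
    '<w:r><w:fldChar w:fldCharType="end"/></w:r>'
    "</w:p></w:body></w:document>"
)

SAMPLE_CLEAN = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Dear friend,</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

if __name__ == "__main__":
    print(find_dde_fields(SAMPLE_WITH_FIELD))
    print(find_dde_fields(SAMPLE_CLEAN))
```

An empty result only means that this particular check found nothing. Fields that span several paragraphs, or instructions hidden in other parts of the package, are not covered by this sketch.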

Didier Stevens
Microsoft MVP Consumer Security

Keywords: maldoc phish scam spam