Analyzing MSI files

Published: 2018-02-19
Last Updated: 2018-02-19 21:58:25 UTC
by Didier Stevens (Version: 1)

Xavier wrote a diary entry about an interesting type of malware sample: MSI files.

As Xavier mentioned, MSI files are Composite Document Files, or as I like to call them, OLE files. MSI files can be inspected with tools that handle OLE files, like 7-Zip, oletools, oledump, ...
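A quick way to confirm that a file is an OLE file before reaching for these tools is to check its first 8 bytes for the compound file signature. The function name below is my own choice for this illustration; it is a minimal sketch, not part of any of the tools mentioned.

```python
# Signature found at the start of every Compound File Binary (OLE) file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_ole(data: bytes) -> bool:
    """Return True if data starts with the OLE compound file signature."""
    return data.startswith(OLE_MAGIC)
```

To use it on a file, read the first 8 bytes in binary mode and pass them to looks_like_ole. A match only says the container is OLE; it does not prove the file is an MSI.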

I've had to analyze MSI files (benign and malicious), and used my tool oledump to search for executables (PE files) inside them. oledump is one of several tools that support YARA rules. I have a YARA rule, contains_pe_file, that searches for embedded PE files by looking for the MZ and PE headers. I ran oledump with that YARA rule on an MSI file.
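To illustrate the idea behind looking for the MZ and PE headers, here is a minimal Python sketch. It is not the contains_pe_file rule itself, and the function name is hypothetical. It finds every "MZ", reads the e_lfanew field at offset 0x3C of the DOS header, and checks that the "PE\0\0" signature is present at the position that field points to.

```python
import struct


def find_embedded_pe(data: bytes) -> list:
    """Return offsets where an MZ header points to a valid PE signature."""
    offsets = []
    pos = data.find(b"MZ")
    while pos != -1:
        # e_lfanew is a 32-bit little-endian value at MZ+0x3C that holds
        # the offset of the PE header, relative to the MZ header.
        if pos + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            pe_pos = pos + e_lfanew
            if data[pe_pos:pe_pos + 4] == b"PE\x00\x00":
                offsets.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return offsets
```

Checking that e_lfanew leads to the PE signature avoids most false positives from the two-byte string "MZ" appearing by chance. A PE file that is compressed inside a CAB file will not be found this way.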

In this MSI file, streams 4 and 5 each contain a PE file. Looking at the content of stream 4, we can see that it is actually a CAB file (header MSCF) containing a PE file.

MSI files often contain CAB files.
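Recognizing a CAB file in a stream comes down to the MSCF signature at the start of the fixed 36-byte cabinet header. The sketch below parses that header with Python's struct module. The function name and the keys of the returned dictionary are my own choices for illustration.

```python
import struct

# Fixed part of the cabinet header (CFHEADER): 36 bytes, little-endian.
_CFHEADER = struct.Struct("<4sIIIIIBBHHHHH")


def parse_cab_header(data: bytes):
    """Return basic CAB header fields as a dict, or None if data is not a CAB."""
    if len(data) < _CFHEADER.size or data[:4] != b"MSCF":
        return None
    (_sig, _res1, cb_cabinet, _res2, _coff_files, _res3,
     ver_minor, ver_major, c_folders, c_files,
     _flags, _set_id, _i_cabinet) = _CFHEADER.unpack_from(data)
    return {
        "size": cb_cabinet,        # total size of the cabinet in bytes
        "version": f"{ver_major}.{ver_minor}",
        "folders": c_folders,      # number of folder entries
        "files": c_files,          # number of file entries
    }
```

Passing the first bytes of a stream to parse_cab_header tells you whether it is a cabinet and how many files it holds. Extracting those files still requires a CAB-aware tool such as 7-Zip.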

Stream 5 contains a PE file.

Looking back at the first screenshot, the stream names don't make much sense (they are hexadecimal values), while Xavier's examples show legible stream names. I did some research, and found out that MSI stream names are encoded with unused Unicode code points. I developed a new oledump plugin, plugin_msi, to decode MSI stream names and to report information such as the header (ASCII) and MD5 hash of each stream.

The name of stream 5 is a good indicator that the embedded PE file is a DLL. This can be confirmed by inspecting the embedded PE file with a tool like pecheck.
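What such a check boils down to is the IMAGE_FILE_DLL flag (0x2000) in the Characteristics field of the PE file's COFF header. Here is a minimal sketch that reads that flag directly. It is not how pecheck is implemented, and the function name is hypothetical.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics flag that marks a DLL


def pe_is_dll(data: bytes) -> bool:
    """Return True if data is a PE image with IMAGE_FILE_DLL set."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the 4-byte PE signature;
    # Characteristics is its last field, at offset 18.
    (characteristics,) = struct.unpack_from("<H", data, e_lfanew + 4 + 18)
    return bool(characteristics & IMAGE_FILE_DLL)
```

The function expects the bytes of the extracted stream, starting at the MZ header.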

If you prefer a GUI tool to analyze MSI files, note that there are several MSI GUI tools for developers, such as Orca.

Do you have a preferred tool to analyze MSI files? Please post a comment!

Didier Stevens
Microsoft MVP Consumer Security
