Malware and XOR - Part 1

Published: 2017-06-05
Last Updated: 2017-06-05 22:01:23 UTC
by Didier Stevens (Version: 1)

Malware authors often encode their malicious payload to avoid detection and make analysis more difficult.

I regularly see payloads encoded with the XOR function. Often, the malware author uses a sequence of bytes as the encoding key. For example, let's take "Password" as the encoding key. The first byte of the payload is XORed with the first byte of the key (P), the second byte of the payload is XORed with the second byte of the key (a), and so on until all bytes of the key have been used. Then we start again with the first byte of the key: the ninth byte of the payload is XORed with the first byte of the key (P), ...
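To make the byte-by-byte description concrete, here is a minimal Python sketch of repeating-key XOR encoding. The function name xor_encode and the sample payload are my own placeholders for illustration, not taken from any particular malware sample.

```python
from itertools import cycle

def xor_encode(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

key = b"Password"
payload = b"MZ payload bytes go here"  # hypothetical stand-in for a real payload
encoded = xor_encode(payload, key)

# The ninth byte (index 8) is XORed with the first key byte (P) again.
assert encoded[8] == payload[8] ^ key[0]
# XOR is its own inverse: encoding twice with the same key restores the payload.
assert xor_encode(encoded, key) == payload
```

Because XOR is its own inverse, the same function serves for both encoding and decoding.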

Let's see what this gives with a Windows executable (a PE file), like this one:

The XOR function has some interesting properties for us analysts. XOR a byte with 0x00 (zero), and you get the same byte: XOR with 0x00 is the identity function (f(x) = x).

Since a normal PE file has many sequences of 0x00 bytes, an XOR encoded PE file will contain the encoding key, like here:

So just by opening an XOR encoded PE file with a binary editor, we can see the repeating key, provided that the key is shorter than the sequences of 0x00 bytes.
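The effect is easy to reproduce. The sketch below assumes a tiny made-up buffer standing in for a PE file (a few header-like bytes, a run of 0x00 bytes, then more data); the buffer and the helper name xor_encode are illustrative only.

```python
from itertools import cycle

def xor_encode(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR (the same operation encodes and decodes)."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

# Hypothetical stand-in for a PE file: some bytes, 40 zero bytes, more bytes.
cleartext = b"MZ\x90\x00\x03" + b"\x00" * 40 + b"PE\x00\x00"
ciphertext = xor_encode(cleartext, b"Password")

# Wherever the cleartext is 0x00, the ciphertext byte equals the key byte,
# so the run of zeros shows the key repeated over and over.
print(ciphertext[5:45])
assert b"Password" * 4 in ciphertext
```

Note that the key appears rotated at the start of the run, depending on where the run begins relative to the key length.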

Second interesting property of the XOR function: if you XOR the original file (cleartext) with the encoded file (ciphertext), you get the key (or to be more precise, the keystream).

Let's take another example. We know that in many PE files, you can find the string "This program cannot be run in DOS mode." (or something similar) in the DOS stub that follows the MZ header. Here is this encoded string in the encoded PE file:

If we XOR this encoded string with the unencoded string, we obtain the key:

So if we have the encoded file and part of the unencoded file, we can also recover the key, provided again that the key is shorter than the unencoded text, and that we know where to line up the encoded and unencoded text.
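Here is a minimal sketch of this known-plaintext recovery. It assumes that we already know the offset of the string in the file and the key length; the test buffer, the offset of 78 and the helper names are assumptions made for the illustration.

```python
from itertools import cycle

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncated to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

def xor_encode(data: bytes, key: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

known = b"This program cannot be run in DOS mode."
# Hypothetical stand-in for the start of a PE file: 78 filler bytes, then the string.
cleartext = b"MZ" + b"\x90" * 76 + known
ciphertext = xor_encode(cleartext, b"Password")

offset = 78   # assumed to be known: where the string sits in the file
key_len = 8   # assumed to be known, or read off the repetition in the keystream

# Ciphertext XOR cleartext gives the keystream at that position.
keystream = xor_bytes(ciphertext[offset:offset + len(known)], known)

# The keystream starts mid-key when offset is not a multiple of key_len,
# so rotate it back to get the key as it applies from the start of the file.
key = bytes(keystream[(j - offset) % key_len] for j in range(key_len))
assert key == b"Password"
```

The rotation step is the "line up" part: without the right offset, we still see the key bytes, but starting somewhere in the middle of the key.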

In an upcoming diary entry, I will show a tool to automate this analysis process.


Didier Stevens
Microsoft MVP Consumer Security

Keywords: malware xor
2 comment(s)


Interesting blog post! Instead of using the text in the PE header you could also go for zero padding at the end of the file. Here is a quick proof of concept that worked for me in most cases:

Great post. I love the way you explain the basics of cryptography.

Your method reminds me of another trick I've used during my small malware analysis escapades, and for which I've written a helper tool.

If you know a string of symbols (binary or ASCII) that should be present in the plaintext, you can create a
template for it (XOR the symbols that are key_length apart from each other), find its position in the XOR-encrypted binary,
and from there easily recover the XOR key.
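One possible reading of this template trick, as a rough sketch rather than the actual tool: XORing two ciphertext bytes that are key_length apart cancels the key, so the result equals the XOR of the two plaintext bytes. The function names and the test buffer below are hypothetical, and the key length is taken as given (in practice you could loop over candidate lengths).

```python
from itertools import cycle

def xor_encode(data: bytes, key: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

def diff(data: bytes, key_len: int) -> bytes:
    """XOR each byte with the byte key_len positions later; a repeating key cancels out."""
    return bytes(data[i] ^ data[i + key_len] for i in range(len(data) - key_len))

def recover_key(ciphertext: bytes, known: bytes, key_len: int):
    """Locate known plaintext via its key-independent template, then derive the key."""
    template = diff(known, key_len)          # needs len(known) > key_len
    pos = diff(ciphertext, key_len).find(template)
    if pos < 0:
        return None
    # Ciphertext XOR known plaintext gives the key, rotated by pos.
    rotated = bytes(c ^ p for c, p in zip(ciphertext[pos:pos + key_len], known))
    return bytes(rotated[(j - pos) % key_len] for j in range(key_len))

known = b"This program cannot be run in DOS mode."
# Hypothetical test buffer: filler bytes followed by the known string.
ciphertext = xor_encode(b"MZ" + b"\x90" * 76 + known, b"Password")
assert recover_key(ciphertext, known, 8) == b"Password"
```

The advantage over a plain known-plaintext XOR is that the position of the string does not have to be known in advance.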

Here is the tool you could try (full disclosure - I'm the author), or you can build your own from the description there:

