Enriching Radare2 and x64dbg malware analysis with statically decoded strings

Published: 2018-09-27. Last Updated: 2018-09-27 22:20:17 UTC
by Renato Marinho (Version: 1)

Today, I came across a bloated malware sample (292 MB) full of encoded strings. It is being distributed in Brazil through compromised WordPress websites as a fake Java update.

A common first step when analyzing a malware sample is to try to decode its strings. Although we cannot rely on them alone to draw conclusions, they may lead to significant findings.

After digging into the algorithm the malware uses to decode its strings at runtime, I was able to write a simple Python script that reverses them statically. This gave me a list of decoded strings, which offered many insights into the malware’s intentions, but it would be more useful to have them attached back to the code while analyzing it statically or dynamically.

While researching the subject, I learned that FLOSS [1], a tool developed by FireEye to automatically extract obfuscated strings from malware, can generate scripts that annotate Radare2 and IDA Pro databases with the decoded strings as comments next to the assembly code. It was exactly what I was looking for but, unfortunately, FLOSS wasn’t able to decode the binary I was analyzing.

So, I decided to take my Python decoding script one step further and make it generate annotation scripts. The implementation is simple and still needs improvement, but it helped me attach the decoded strings to the malware during static analysis with Radare2 and dynamic analysis with x64dbg.

Decoding function

In today’s malware sample, the strings were encrypted with an XOR algorithm. XOR encoding is frequently used in malware and, depending on the key size, is not difficult to reverse by brute force with tools like XORSearch and XORStrings [2]. Larger keys, as in my case, are hard to brute-force and require code analysis, as seen in Figure 1.

Figure 1 – Decoding function disassembly
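To illustrate why the key size matters, here is a minimal sketch of a single-byte XOR brute force. It is not how XORSearch works internally; the sample data, the 0x5A key and the "http" search term are all made up for the example. A one-byte key leaves only 256 candidates to try, while every extra key byte multiplies the search space by 256.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def brute_force_single_byte(data: bytes, crib: bytes) -> list:
    """Return every one-byte key whose output contains the expected text."""
    return [k for k in range(256) if crib in xor_bytes(data, bytes([k]))]


# Hypothetical sample: a URL encoded with the one-byte key 0x5A.
encoded = xor_bytes(b"http://example.com/update.jar", b"\x5a")
candidate_keys = brute_force_single_byte(encoded, b"http")
print([hex(k) for k in candidate_keys])
```

With a long key like the one in this sample, the same loop would need far too many iterations, which is why reading the decoding routine to recover the key is the practical route.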

Decoding strings

Now that we understand the decoding algorithm and know the key, it is time to decode the strings. To this end, I implemented a simple Python script, as shown in Figure 2.

Figure 2 – Decoding function in Python

Given one of the encoded strings and the key, it returns the decoded string.

Figure 3 – Decoding sample string
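For readers who want something to experiment with, a decoder of this kind can be sketched as below. This is not the exact script from Figure 2. It assumes that the encoded strings are stored as hexadecimal text and that the key is an ASCII string repeated over the data; the key and the sample string are placeholders, not values from the malware.

```python
def decode_string(encoded_hex: str, key: str) -> str:
    """Decode a hex-encoded string XORed with a repeating key (assumed layout)."""
    raw = bytes.fromhex(encoded_hex)
    key_bytes = key.encode("latin-1")
    decoded = bytes(b ^ key_bytes[i % len(key_bytes)] for i, b in enumerate(raw))
    # latin-1 maps every byte to a character, so decoding never fails.
    return decoded.decode("latin-1")


# Placeholder key, NOT the key used by the sample.
KEY = "NotTheRealKey2018"

# Build a test string by encoding known text with the same routine
# (XOR is symmetric, so encoding and decoding are the same operation).
plain = b"cmd.exe /c"
sample = bytes(b ^ KEY.encode("latin-1")[i % len(KEY)] for i, b in enumerate(plain)).hex()

print(sample)
print(decode_string(sample, KEY))
```

If the sample stores its strings in a different representation (for instance Base64 or raw bytes), only the first line of `decode_string` needs to change.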

Batch Decoding

Once the decoding script is working, it’s time to decode all the strings embedded in the malware. To this end, it was necessary to extract all the strings from the binary and select those that matched the encoding pattern. Figure 4 shows part of the string list with the respective offsets.

Figure 4 – Encoded string list
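The extraction step can be approximated in a few lines of Python. This is only a sketch: the pattern below (runs of at least four uppercase hex byte pairs) is an assumption made for illustration and must be adapted to whatever the encoded strings look like in the binary under analysis.

```python
import re
from pathlib import Path

# Assumed pattern: at least 4 byte pairs written as uppercase hex digits.
ENCODED_PATTERN = re.compile(rb"(?:[0-9A-F]{2}){4,}")


def find_encoded_strings(blob: bytes) -> list:
    """Return (file_offset, text) for every match of the assumed pattern."""
    return [(m.start(), m.group().decode("ascii"))
            for m in ENCODED_PATTERN.finditer(blob)]


def scan_file(path: str) -> list:
    """Convenience wrapper that scans a file on disk."""
    return find_encoded_strings(Path(path).read_bytes())


# Small in-memory example standing in for the real binary.
blob = b"\x00\x01MZ\x90" + b"1A2B3C4D5E" + b"\x00text\x00" + b"DEADBEEF00" + b"\xff"
for offset, text in find_encoded_strings(blob):
    print(f"{offset:#010x} {text}")
```

Keeping the offset next to each string is what later makes it possible to place the decoded text at the right location.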

The next step was to make the decoding script parse this file and generate the outputs. Note that the offsets are needed later to attach the decoded strings to the code.

Figure 5 shows the code snippet in charge of parsing and generating the output files.

Figure 5 – Generating Radare2 and x64dbg scripts
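As a rough idea of what such a generator can look like (again, not the code from Figure 5), the sketch below emits one comment command per decoded string. It uses the `CCu base64:` comment form for Radare2 and the `cmt` command for x64dbg. It assumes that each address is already the virtual address where the comment should be placed; translating file offsets into those addresses is not shown. The function names and sample entries are hypothetical.

```python
import base64


def r2_comment(address: int, text: str) -> str:
    """Radare2 comment command; base64 avoids quoting problems in the text."""
    b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"CCu base64:{b64} @ {address:#x}"


def x64dbg_comment(address: int, text: str) -> str:
    """x64dbg comment command; double quotes are swapped for single quotes
    as a conservative choice instead of relying on escaping rules."""
    safe = text.replace('"', "'").replace("\r", " ").replace("\n", " ")
    return f'cmt {address:#x}, "{safe}"'


def build_scripts(entries: list) -> tuple:
    """entries is a list of (address, decoded_string); returns both scripts."""
    r2_script = "\n".join(r2_comment(a, s) for a, s in entries)
    dbg_script = "\n".join(x64dbg_comment(a, s) for a, s in entries)
    return r2_script, dbg_script


# Hypothetical entries, not taken from the sample.
entries = [(0x401000, "cmd.exe"), (0x401020, 'say "hi"')]
r2_script, dbg_script = build_scripts(entries)
print(r2_script)
print(dbg_script)
```

Each script can then be written to its own file and loaded in the corresponding tool.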

Figure 6 shows part of the output for the x64dbg script.

Figure 6 – x64dbg script

Attaching strings back to the code

Now, it is time to run the generated scripts. In Radare2 (Cutter), select and run the script through the menu File -> Run Script. In x64dbg, select and run the script in the “Script” tab while debugging the malware.

In both tools, the decoded strings are annotated as comments next to the assembly code that references the original encoded strings, as shown in Figures 7 and 8.

Figure 7 – Decoded strings in Radare2

Figure 8 – Decoded strings in x64dbg

Final words

As described, this was the way I found to make the decoded strings easier to use while analyzing this malware statically and dynamically. I hope it is helpful for other analysts. If you know or use a different way to do this, share it with us!

IOCs

MD5 143081b4031958288dc2a3e9f1d5008d

References

[1] https://www.fireeye.com/blog/threat-research/2016/06/automatically-extracting-obfuscated-strings.html
[2] https://blog.didierstevens.com/programs/xorsearch/

--
Renato Marinho
Morphus Labs | LinkedIn | Twitter
