Another Malicious HTA File Analysis - Part 2

Published: 2023-04-10
Last Updated: 2023-04-10 08:13:31 UTC
by Didier Stevens (Version: 1)

The first part in this series can be found here. In the first part, we ended with a decoded PowerShell script. We will now start to decrypt the payload found inside this PowerShell script:

1: right under number 1, you can find a variable that contains the BASE64 encoded ciphertext.

2: right under number 2, you can find a variable that contains the BASE64 encoded key.

3: right under number 3, you can see that this is AES encryption

4: right under number 4, you can see that ECB mode is used

5: right under number 5, you can see that the first 16 bytes of the BASE64-decoded payload are treated as the initialisation vector (although ECB mode does not use an IV)

6: right under number 6, you can see GZip decompression classes & methods.

Thus, to obtain the decoded payload, we need to BASE64-decode it, decrypt it and decompress it.

Let's start with my tool to extract & decode BASE64 strings:

With option -n, we can specify a minimum length for BASE64 strings, so that we only keep the long strings that interest us (ciphertext and key):

And then, we use option --jsonoutput to produce JSON output that contains the complete extracted BASE64 strings. This JSON format can be processed by other tools I develop.
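As a rough illustration of what such an extraction step does (this is not the actual implementation of my tool), the sketch below finds BASE64 candidates of a minimum length with a regular expression and decodes them. The function name extract_base64, the regular expression and the 'encoded' key are assumptions made for this example; only the 'content' key mirrors the key that the Transform function reads further on.

```python
import base64
import binascii
import re

# Runs of BASE64 alphabet characters, optionally followed by padding.
BASE64_RE = re.compile(r'[A-Za-z0-9+/]+={0,2}')

def extract_base64(text, min_length=16):
    """Return decoded BASE64 candidates whose encoded form has at least min_length characters."""
    items = []
    for match in BASE64_RE.finditer(text):
        candidate = match.group(0)
        # Skip short strings and strings that cannot be complete BASE64.
        if len(candidate) < min_length or len(candidate) % 4 != 0:
            continue
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except binascii.Error:
            continue
        items.append({'encoded': candidate, 'content': decoded})
    return items

# Made-up script fragment: one long BASE64 string and one short one.
script = '$c = "' + base64.b64encode(b'A' * 32).decode() + '"; $k = "QUJD"'
found = extract_base64(script, min_length=16)
print(len(found))
```

A simple filter like this can also pick up ordinary identifiers that happen to look like BASE64, which is why a well-chosen minimum length matters.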

Here specifically, we will use a tool that reads JSON data produced by my tools, and then transforms that data with a user-provided Python function.

The script containing the function we will use is the following:

from Crypto.Cipher import AES
import gzip

def Transform(items, options):
    ciphertext = items[0]['content']
    key        = items[1]['content']

    iv = ciphertext[:AES.block_size] # unused
    oAES = AES.new(key, AES.MODE_ECB)
    cleartext = oAES.decrypt(ciphertext[AES.block_size:])
    transformed = gzip.decompress(cleartext)

    items[0]['content'] = transformed
    return transformed

This script defines a function Transform that is called by my tool.

As parameters, it receives a list (items) with the decoded JSON data and the options passed on to my tool.

The first 2 lines of the script import modules necessary for AES decryption and GZip decompression.

The ciphertext, extracted and BASE64-decoded by my tool from the PowerShell script, is the first item in the list.

The encryption key is the second item in the list:

Each item is a dictionary, and the decoded data can be found under key 'content'.

We assign this to variables ciphertext and key.

We also extract the initialisation vector from the first 16 bytes of the ciphertext into variable iv, but this will not be used as the malware developers use ECB mode, and that mode works without an IV.

Then we create an AES object (oAES) with the key and ECB mode.

We call method decrypt of object oAES, giving it the ciphertext (excluding the first 16 bytes, i.e., the IV): we store the decrypted data into variable cleartext.

Then we decompress the cleartext and store it into variable transformed.

Finally, we store the decoded payload (variable transformed) back into the items list and we make it the return value of our Transform function.

This summarizes how function Transform works.

We can use this function as follows:

The decrypted payload is another PowerShell script ...

Notice that this PowerShell script contains a series of numbers (4800+), and a single number in the same range: 4761.

As explained in diary entry "Extra: "String Obfuscation: Character Pair Reversal"", this is an encoded payload that can be decoded with my tool

Like this:

We see some extensions, a .Net object, and a URL.

The file obtained from this URL, is a .bat file and can be found on MalwareBazaar too.
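The number decoding used above boils down to simple arithmetic: with the expression "n - 4761", 4761 is subtracted from each number in the series and the result is taken as a character code. Here is a minimal Python sketch of that idea. The function name decode_numbers and the sample input are made up for illustration and are not taken from the sample.

```python
import re

def decode_numbers(text, key=4761):
    """Subtract key from every number found in text and join the results as characters."""
    numbers = [int(token) for token in re.findall(r'\d+', text)]
    return ''.join(chr(n - key) for n in numbers)

# Made-up sample: the string 'http' encoded by adding 4761 to each character code.
sample = ', '.join(str(ord(c) + 4761) for c in 'http')
print(sample)                  # 4865, 4877, 4877, 4873
print(decode_numbers(sample))  # http
```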

This is the complete command to extract the URL from the HTA file: -D | --split ":" --regex "^a.+=(.+) - &H(.+)$" -j "" "chr(int(oMatch.groups()[0]) - int(oMatch.groups()[1], 16))" | -n 16 --jsonoutput | -s | -i "n - 4761"
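The first stage of that pipeline can be mirrored in plain Python to see what the regular expression and the expression on its groups do: the input is split on ":", each statement matching "^a.+=(.+) - &H(.+)$" yields a decimal number and a hexadecimal number, and their difference is a character code. This is only a sketch of that logic. The function name decode_statements and the sample input are assumptions for illustration.

```python
import re

# Same regular expression as in the command above.
STATEMENT_RE = re.compile(r'^a.+=(.+) - &H(.+)$')

def decode_statements(script):
    """Decode statements of the form a<name>=<decimal> - &H<hex> that are separated by ':'."""
    chars = []
    for statement in script.split(':'):
        match = STATEMENT_RE.match(statement)
        if match is None:
            # Not an obfuscated assignment: ignore it.
            continue
        decimal_part, hex_part = match.groups()
        chars.append(chr(int(decimal_part) - int(hex_part, 16)))
    return ''.join(chars)

# Made-up sample: 212 - 0x6C = 104 ('h'), 213 - 0x6C = 105 ('i').
sample = 'a1=212 - &H6C:a2=213 - &H6C:x=1'
print(decode_statements(sample))  # hi
```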

Didier Stevens
Senior handler
Microsoft MVP
