Obfuscating without XOR

Malicious files are generated and spread over the wild Internet daily (read: "hourly"). The goal of the attackers is to use files that are:

  • not known to signature-based solutions
  • not easy to read for the human eye

That’s why many obfuscation techniques exist to fool automated tools and security analysts. In most cases, it’s just a question of time to decode the obfuscated data. A classic technique is to use the XOR cipher[1]. This is definitely not a new technique (see a previous diary[2] from 2012) but it is still heavily used, and many tools can automate the search for XOR’d strings. Viper, the binary analysis and management framework, is a good example. It can scan for XOR'd strings easily:

viper tmpnYaBJs > xor -a
[*] Searching for the following strings:
- This Program
- GetSystemDirectory
- CreateFile
- IsBadReadPtr
- IsBadWritePtrGetProcAddress
- LoadLibrary
- WinExec
- CreateFileShellExecute
- CloseHandle
- UrlDownloadToFile
- GetTempPath
- ReadFile
- WriteFile
- SetFilePointer
- GetProcAddr
- VirtualAlloc
- http
[*] Hold on, this might take a while...
[*] Searching XOR
[!] Matched: http with key: 0x74
[*] Searching ROT
viper tmpnYaBJs >
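
To illustrate why single-byte XOR is so easy to detect, here is a minimal brute-force sketch. It is not Viper's implementation: the function names (xorBytes, findXorKeys) and the demo string are made up for the example, and it only covers one-byte keys.

```javascript
// Apply a single-byte XOR key to an array of byte values.
function xorBytes(bytes, key) {
    return bytes.map(function (b) { return b ^ key; });
}

// Try every non-zero single-byte key and record the keys that reveal one of
// the known plaintext strings ("needles") in the decoded buffer.
function findXorKeys(bytes, needles) {
    var hits = [];
    for (var key = 1; key < 256; key++) {
        var text = String.fromCharCode.apply(null, xorBytes(bytes, key));
        needles.forEach(function (needle) {
            if (text.indexOf(needle) !== -1) {
                hits.push({ needle: needle, key: key });
            }
        });
    }
    return hits;
}

// Demo data (made up for this sketch): a short string XOR'd with 0x74.
var plain = "GET http://example.test/";
var plainBytes = plain.split("").map(function (c) { return c.charCodeAt(0); });
var hits = findXorKeys(xorBytes(plainBytes, 0x74), ["http", "WinExec"]);
console.log(hits); // one hit: needle "http", key 116 (0x74)
```

With only 255 candidate keys and a short list of well-known strings, the search is cheap, which is one reason attackers move to other encodings.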

Today, many JavaScript or VBS files implement other obfuscation techniques that do not rely on XOR. Yesterday, I found a sample that had such behaviour. A first quick analysis revealed that almost no string was in clear text in the source; a function call was used in place of regular strings, like:

var bcacfdfaebbbfDeck = new ActiveXObject(dbdbfaeefccaee('+L+^%^LK%,LpL(KeL^%z%+%u%u',1));

I took some time to check how the obfuscation was performed. How does it work?

Each character of the obfuscated string is looked up in the $data variable and its position is decreased by one (the second argument of the call). The character found at this new position is appended to a string of hex codes. Finally, the hex codes are converted into the final string. Example with the first two characters of the string above:

$data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";

  • "+" is located at position 20 (counting from 0); take the character at position 19 (20 - 1): "5"
  • "L" is located at position 5; take the character at position 4 (5 - 1): "7"
  • "57" is the hex code for "W"
  • etc...
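
The lookup described above can be sketched in a few lines. This is only an illustration of the first two characters, not the sample's code: the helper name shiftBack is hypothetical, and positions are counted from 0.

```javascript
var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";

// Hypothetical helper: return the character located one position before `ch`
// in `data`. No wrap-around handling: it assumes `ch` is present in `data`
// and is not its first character.
function shiftBack(ch) {
    return data.charAt(data.indexOf(ch) - 1);
}

var hex = shiftBack("+") + shiftBack("L");              // "57"
var decoded = String.fromCharCode(parseInt(hex, 16));   // "W"
console.log(hex, decoded);
```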

Here is the beautified code from the malicious file:

// Convert a string from hex chars to string.
// In: "575363726970742E7368656C6C"
// Out: "WScript.shell"
function hex2string(hexstring) {
    var bufferin = hexstring.toString();
    var bufferout = '';
    // Read two hex digits at a time and convert them to one character
    for (var i = 0; i < bufferin.length; i += 2)
        bufferout += String.fromCharCode(parseInt(bufferin.substr(i, 2), 16));
    return bufferout;
}

// Decode the obfuscated string by shifting each character back by 'step' positions in data
function deobfuscate(string,step){
    var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";
    var bufferout = "";
    var l = data.length-1;
    var size = string.length;

    for (var i = 0; i <size ; i++){
        var p = data.indexOf(string.charAt(i));
        var p2 = p - step;
        // Wrap around when the shift goes before the start of data
        if (p2 < 0) {
            p2 = l - Math.abs(p2);
            var l2 = l - 1;
            if (p2==l2)
               p2 = p2 + step;
        }
        bufferout = bufferout + data.charAt(p2);
    }
    // Convert to string
    return hex2string(bufferout);
}

This code:

var s = deobfuscate('%zL(L(Lp^2KNKN^P^z^+Ke^P^+^(Ke^+^KKe^P^p^PKN%u%N%L%NKe%,%0%L',1);

returns the string hxxp://185[.]154[.]52[.]101/logo.img (defanged).

Once you understand how to deobfuscate, it’s easy to write the inverse function. So I quickly wrote a function to obfuscate any string based on the same technique:

function obfuscate(string,step){
    var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";
    var bufferout = "";
    var l = data.length-1;
    var size = string.length;
    for (var i = 0; i <size ; i++){
        // Hex code of the current character, in upper case to match the digits in data
        var hvalue = Number(string.charCodeAt(i)).toString(16).toUpperCase();
        for (var j=0; j < 2; j++) {
            var p = data.indexOf(hvalue.charAt(j));
            var p2 = p + step;
            // Not reached when step is positive (mirrors the wrap-around test in deobfuscate)
            if (p2<0) {
                p2 = l + Math.abs(p2);
                var l2 = l + 1;
                if (p2==l2)
                    p2 = p2 - step;
            }
            bufferout = bufferout + data.charAt(p2);
        }
    }
    return bufferout;
}

This code:

var foo = obfuscate("", 1);
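
As a sanity check of the encoding direction, the sketch below re-encodes "WScript.shell" and compares the result with the string seen in the sample's ActiveXObject call. It is a compact, self-contained variant written for this check, not the sample's code: the name encodeString is hypothetical, and the zero-padding and modulo wrap-around are assumptions that are not exercised with a step of 1.

```javascript
var data = "SYOm7L-3^o&x4(CuD0p5+@rW*qvUEec!8zZsQhdIwaHn:Tf9,Vyil6%;jXtMA2Kbk_FN)GB.$1PJgR";

// Hypothetical helper: hex-encode each character, then replace every hex digit
// by the character located `step` positions after it in `data`.
function encodeString(text, step) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        // Two upper-case hex digits per character (assumes char codes below 0x100).
        var hex = ("0" + text.charCodeAt(i).toString(16).toUpperCase()).slice(-2);
        for (var j = 0; j < 2; j++) {
            var p = data.indexOf(hex.charAt(j));
            // Modulo wrap-around is an assumption; never reached with step 1.
            out += data.charAt((p + step) % data.length);
        }
    }
    return out;
}

var result = encodeString("WScript.shell", 1);
console.log(result === "+L+^%^LK%,LpL(KeL^%z%+%u%u"); // true
```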



Of course, the method analyzed here is a one shot! The number of ways to obfuscate data is unlimited...


Xavier Mertens (@xme)
ISC Handler - Freelance Security Consultant


> Of course, the method analyzed here is a one shot! The number of ways to obfuscate data is unlimited...

Just the same, it is a general method with a long history of use in manual ciphers. It is a polyalphabetic substitution cipher. The Wikipedia article says, "The Alberti cipher by Leon Battista Alberti around 1467 is believed to be the first polyalphabetic cipher." Yet I hadn't before read of it being used in malware.



