Browser *does* matter, not only for vulnerabilities - a story on JavaScript deobfuscation

Published: 2006-07-28
Last Updated: 2006-07-28 11:15:47 UTC
by Bojan Zdrnja (Version: 1)
Reader Gilbert sent us a link to a site that caused a couple of alerts from his anti-virus program. Initially this looked like a typical web site hosting malware, but it turned out to be much more.

The HTML document was absolutely standard, except for one iframe which was, of course, hidden. This raised our eyebrows and we started following what turned out to be an interesting obfuscation.

Before we start, we should stress that if you decide to do something similar, you should always do it on an isolated machine. Virtual machines are ideal candidates for playing with malware of this type.

First layer

The iframe pointed to a JavaScript file which used (today) more or less standard obfuscation: a function was defined with various permutations and it was called with a document.write at the end.
The obfuscation typically looks like this:

function dc(x){var l=x.length, ...
... Various permutations ...
document.write(r)}}

dc(a_HDcBY@icbCvFIEjg ... long ASCII string ...);

As you can see, the dc() function is called with the encoded string, and document.write() is called at the end. This results in another JavaScript which is executed immediately after it's decoded by the document.write() call.
Decoding this JavaScript is typically pretty easy and there are numerous ways of doing it. The easiest way to see what's going on is to replace the document.write() call with an alert() call: this will cause your browser to show the decoded code in an alert popup instead of executing it.
If you want to be more creative, you can add code which replaces all < and > characters with something else, so that your browser displays the output instead of parsing it as markup and running the script.

Another nice trick is to add document.write("<textarea rows=100 cols=100>"); as the first line after the script tag and document.write("</textarea>"); as the last line. The decoded JavaScript will now show up nicely inside the text area, from which it can easily be copied and pasted. Thanks to Tom and Johannes for this trick.
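The first two ideas (swapping the output call and neutralising the angle brackets) can be sketched as follows. This is a minimal illustration, not the code from the malicious page: decodeToy() is a hypothetical stand-in for the elided permutation logic, and the sink parameter takes the place of document.write().

```javascript
// Hypothetical stand-in for the real decoder, which is elided in the article.
// It simply shifts every character code down by one.
function decodeToy(encoded) {
  let out = "";
  for (let i = 0; i < encoded.length; i++) {
    out += String.fromCharCode(encoded.charCodeAt(i) - 1);
  }
  return out;
}

// Make markup inert so a browser would display it instead of parsing it.
function escapeMarkup(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Same shape as dc(): decode, then hand the result to an output call.
// The output call is a parameter here, so nothing is written to a page.
function dcSafe(encoded, sink) {
  sink(escapeMarkup(decodeToy(encoded)));
}

// Build a harmless sample by applying the inverse of decodeToy().
const payload = "<script>alert(1)</script>";
const encodedPayload = Array.from(payload, (c) =>
  String.fromCharCode(c.charCodeAt(0) + 1)
).join("");

const captured = [];
dcSafe(encodedPayload, (s) => captured.push(s));
console.log(captured[0]); // &lt;script&gt;alert(1)&lt;/script&gt;
```

Passing the sink as an argument keeps the decoder untouched while the output goes somewhere harmless, which is the same effect as replacing document.write() with alert().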

Second layer

So, once the first layer was decoded we were greeted with more obfuscated code. This code looked similar to the previous layer (but boy, were we wrong about that). This time the (de-)obfuscation function didn't end with a document.write() call, but with an eval() call. The eval() call evaluates a string and executes it as if it were script code. In other words, when obfuscating code, the eval() call is typically used to run the decoded string, which often calls some part of the existing code again.

This part is what caused us the most problems, and here's why. The deobfuscating code looks like this (uninteresting parts have been removed):

function r(lI,t) {
...
for(var sa=0;sa<lI.length;sa+=arguments.callee.toString().length-444)
{
...
Various permutations;
}
eval(ii);
};
r('string');

So again, the function r() is called with a long ASCII string. There is a for loop which does various permutations and then eval() is called. The first thing we did here was to replace eval(ii) with alert(ii). That was easy enough, but a surprise was waiting for us: when we executed that we just got some binary output. Hmm, that can't be right.

After studying the permutation for loop a bit (the permutations themselves aren't interesting for us) we saw something interesting in the for loop definition itself. You can see that the sa variable is increased in every step by something weird: sa+=arguments.callee.toString().length.

We had never seen this before and (not being JavaScript experts) went to see what arguments.callee actually is. After some searching, we found a couple of references, like this one on Mozilla's web site: http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Objects:Function:arguments:callee.

Now the whole night became very, very interesting. arguments.callee refers to the currently executing function, and calling toString() on it returns that function's source code! So this code actually uses itself to do the permutation: if you change something in the code (like we did when we replaced the eval() call with alert()) you will break the whole deobfuscation part!
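To see how an edit changes what the function "sees" of itself, here is a small sketch with two toy functions that differ only in the final call. The names f1 and f2 are hypothetical, and toString() is called on the function names rather than through arguments.callee, since arguments.callee is not available in strict mode; inside the function both refer to the same function object.

```javascript
// Two toy functions that differ only in the final call, like our edit.
function f1(ii) { eval(ii); }
function f2(ii) { alert(ii); } // never called here, so alert() need not exist

// toString() on a function returns its source text.
const lenEval = f1.toString().length;
const lenAlert = f2.toString().length;

// "alert" is one character longer than "eval", and the length shows it.
console.log(lenAlert - lenEval); // 1
```

Any value derived from that length, such as a loop stride or a key, changes as soon as the function body is touched.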

Firefox is not IE and vice versa

But this is not the end of the story. In this case, the sa variable is increased by a value calculated from the length of the function's source (arguments.callee.toString().length returns the length of the whole function in characters) decreased by 444.
So, when we replaced eval() with alert(), we added 1 character to the function length, and we had to increase this number to 445 to compensate. Easy: we changed that, ran it in Mozilla, and it didn't work again!
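The compensation can be worked through with a toy decoder. Everything here is assumed for illustration: ORIGINAL_LENGTH is a made-up source length, and encodeToy() and decodeToy() only mimic the shape of the loop (a stride of source length minus an offset), not the real permutations.

```javascript
// Hypothetical source length; the real r() is not reproduced here.
const ORIGINAL_LENGTH = 447;

// Pad every plaintext character so that it sits at a multiple of `step`.
function encodeToy(plain, step) {
  return Array.from(plain, (c) => c + "#".repeat(step - 1)).join("");
}

// Same loop shape as the sample: the stride is source length minus an offset.
function decodeToy(encoded, sourceLength, offset) {
  const step = sourceLength - offset;
  let out = "";
  for (let sa = 0; sa < encoded.length; sa += step) {
    out += encoded[sa];
  }
  return out;
}

const encoded = encodeToy("alert", ORIGINAL_LENGTH - 444);

// Untouched function: the stride is 3 and the text decodes cleanly.
const untouched = decodeToy(encoded, ORIGINAL_LENGTH, 444);
// Function made one character longer, offset left at 444: stride 4, garbage.
const editedOnly = decodeToy(encoded, ORIGINAL_LENGTH + 1, 444);
// Function made one character longer, offset raised to 445: stride 3 again.
const compensated = decodeToy(encoded, ORIGINAL_LENGTH + 1, 445);

console.log(untouched, editedOnly, compensated); // alert a##t alert
```

The offset has to move by exactly as many characters as the edit added or removed, which only works if the browser reports the length the author expected.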

Two hours later (fast forward for you) we found a very interesting thing: Mozilla Firefox and Internet Explorer don't return the same value when arguments.callee.toString().length is called on a function!
This is easy to test if you want: just create the following HTML file:

<html>
<head>
    <script type="text/javascript">
    <!--

    function func(){var l = arguments.callee.toString().length;alert(l);}

    func();
    //-->
    </script>
</head>
</html>

This JavaScript creates a function called func and then displays the content of the variable "l", which is whatever the call to arguments.callee.toString().length returns.
On our test machine, when this JavaScript is executed in Mozilla Firefox we get 81, while in Internet Explorer we get 69.
Why this is happening is another story, so let's get back to our deobfuscation.
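For the curious, the two numbers are consistent with one browser returning the source exactly as written and the other returning a re-indented version of it. That is an assumption made for this sketch, not something established by the test above; the re-indented layout below (one statement per line, four-space indent) is a guess that happens to reproduce the second number.

```javascript
// Source text exactly as written in the test page.
const asWritten =
  "function func(){var l = arguments.callee.toString().length;alert(l);}";

// Assumed re-indented form: one statement per line, four-space indent.
const reindented = [
  "function func() {",
  "    var l = arguments.callee.toString().length;",
  "    alert(l);",
  "}",
].join("\n");

console.log(asWritten.length);  // 69
console.log(reindented.length); // 81
```

Whatever the actual cause, the practical consequence is the same: an offset such as 444 is only valid for the browser family the obfuscator was written for.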

A recursive call

Knowing this, we realized that our alert() call with the increased number (444 to 445) was actually OK, but we had to execute it in Internet Explorer, or calculate new numbers for Mozilla.
We decided to use Internet Explorer (in a virtual machine, of course) and voila: we got the eval() call content, which was another call to the r() function. Knowing how the r() function works, all we had to do was call it again and we got another obfuscated JavaScript (this is the 4th step already).
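Peeling nested calls like this can be automated by capturing what would have been passed to eval() and feeding it back in. The sketch below is only an illustration under stated assumptions: toyR() is a hypothetical decoder that merely reverses its input, and the regular expression assumes each layer is a single r('...') call.

```javascript
// Hypothetical stand-in for r(): the real permutations are not reproduced.
function toyR(s) {
  return s.split("").reverse().join("");
}

// Repeatedly decode while the current text is itself a call to r('...').
// Nothing is ever passed to eval(); every layer is only collected.
function peelLayers(script, maxLayers) {
  const callPattern = /^r\('(.*)'\);$/;
  const layers = [];
  let current = script;
  for (let i = 0; i < maxLayers; i++) {
    const match = callPattern.exec(current);
    if (!match) break; // not another r() call, so we are done
    current = toyR(match[1]);
    layers.push(current);
  }
  return layers;
}

// Build a harmless two-layer sample with the inverse of toyR().
const inner = "final();";
const layer2 = "r('" + toyR(inner) + "');";
const layer1 = "r('" + toyR(layer2) + "');";

const layers = peelLayers(layer1, 10);
console.log(layers.length);             // 2
console.log(layers[layers.length - 1]); // final();
```

The maxLayers limit is there so that a sample which keeps producing new r() calls cannot loop forever.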

This time, our job was easier. This part just consisted of an unescape() call. The argument to unescape() is simply the ASCII characters of the next layer written as percent-encoded hexadecimal values. You can even "crack" this manually, but it is easier to assign the result of this call to a variable and then display that with another alert() call.
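As a small illustration of that step, the snippet below decodes a made-up percent-encoded string. The string and the file name in it are hypothetical, and console.log() stands in for the alert() call so the snippet also runs outside a browser.

```javascript
// Hypothetical encoded string; each %XX is the hex code of one ASCII character.
const escaped = "%3Ciframe%20src%3D%22example.php%22%3E";

// Assign the result to a variable instead of writing or evaluating it.
const decoded = unescape(escaped);

console.log(decoded); // <iframe src="example.php">
```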

After the unescape() call we were greeted with another obfuscation routine, which was simple this time. It was actually very similar to the routine used in the first step, so everything written above applied to this step as well.

The final result

And we finally arrived at the end of the rabbit hole (we caught the white rabbit). The final result was a bit disappointing at first: it was another iframe pointing to a PHP file.

That PHP file was an exploit for MS06-014 (RDS.Datastore data execution), which dropped a downloader on the victim's machine. The downloader in turn downloaded the second stage, which was a keylogger called Trojan.Anserin.

The dropped malware was more or less typical, but this was indeed a very interesting investigation, which taught us some new things about obfuscation that we hadn't seen before.

Thanks to fellow handlers Arrigo and Swa for help with the analysis.
