The amazingly scary xz sshd backdoor
Unless you took the whole weekend off, you must have seen by now that Andres Freund published an amazing discovery on Friday on the Openwall mailing list (https://www.openwall.com/lists/oss-security/2024/03/29/4).
The whole story around this is both fascinating and scary – and I’m sure will be told around numerous time, so in this diary I will put some technical things about the backdoor that I reversed for quite some time (and I have a feeling I could spend 2 more weeks on this).
There is also a nice gist by smx-smx here that gets updated regularly so keep an eye there as well.
The author(s) of the backdoor went a long way to make the backdoor look as innocent as possible. This is also why all the reversing effort is taking such a long(er) time. Let’s take a look at couple of fascinating things in this backdoor.
String comparison
One of the first things a reverse engineer will do is to search for strings in the code they are looking at. If strings are visible, they can usually tell a lot about the target binary. But if we take a look at the library (and for this diary I am using the one originally sent by Andres) we will see practically no visible strings.
The authors decided to obfuscate all strings – in order to do that, they stored strings as a radix tree (also known as prefix tree or trie, more info at https://en.wikipedia.org/wiki/Radix_tree). This allows them to store all strings as obfuscated, however now one of the challenges they had was to lookup strings – they implemented a function that checks whether a string exists in the radix tree table, and if it does, it returns back the offset:
The image above shows start of the function (originally called Lsimple_coder_update_0) where I also expanded one of the radix tree tables (_Llzip_decode_1). So as input we send a string, and if it exists in the radix tree, we get back its offset. The code needs to do this for every single string lookup, but that should not cause the delay as radix tree lookups are quite fast.
Anti-debugging
Anti-debugging efforts were implemented on several levels. The simplest one includes code where the backdoor is checking if a function starts with the ‘endbr64’ instruction. Let’s check how this works – first the snippet below shows how the .Llzma_block_buffer_encode.0 function gets called. Notice the value 0x0E230 being passed in rdx:
The following figure shows how the backdoor checks whether the ‘endbr64’ instruction is still there (if not – it has been probably changed by a debugger which signals the backdoor to abort):
Another important register here is rdi, which will points to the location of the ‘endbr64’ instruction. In other words, if everything is OK, value of [rdi] must be 0xFA1E0FF3, which is the opcode for the ‘endbr64’ instruction.
So now when we go through code, the line add eax, [rdi] should sum 0xFA1E0FF3 and 0x5E2E230 which results in 0xF223. If that is correct 1 is returned, otherwise 0 (indicating that the 'endbr64' instruction was not there). Quite cool.
There is more another function that dynamically modifies code, but that is horrible for reversing – perhaps later ?
Collecting usernames and IP addresses?
The final function we’ll take a look at is also interesting – it will parse every log created by the sshd service and will try to extract valid usernames and IP addresses. The function is .Llzma12_mode_map.part.1 and we can see the initial comparisons (with my comments) here:
So with this, the attacker is parsing a log line such as the following:
Accepted password for root from 192.168.14.1 port 62405 ssh2
Further below in the function we can see that they match on the ‘from’ and ‘ssh2’ keywords and extract what’s between them:
Finally, they pass this to another function, however that’s not visible in static analysis so this is where I lost it.
Guess that’s it for today – depending on developments we might update this diary but in any case check the resources posted at the beginning as they get updated quite often.
If you are looking at this too - hope you're having fun. I quite enjoyed this (too bad it's back to dayjob soon heh), but I'm also scared about how *close* this was to infect zillion servers.
Comments