YARA XOR Strings: Some Remarks

Published: 2018-10-07
Last Updated: 2018-10-07 23:01:12 UTC
by Didier Stevens (Version: 1)

There were quite a few comments on yesterday's diary entry "YARA: XOR Strings".

As pointed out by rebus, in some cases it's not so useful that option -s (--print-strings) outputs the encoded string.

One potential work-around is to use my tool XORSearch after a YARA rule triggered: it will list the cleartext string along with the XOR key.
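To illustrate the idea behind this work-around, here is a minimal Python sketch that brute-forces the single-byte XOR key for a known cleartext string. It is not how XORSearch is implemented; the function name find_xor_string and the sample data are made up for this example.

```python
def find_xor_string(data: bytes, cleartext: bytes):
    """Yield (offset, key) for each single-byte XOR encoding of cleartext in data."""
    for key in range(1, 256):
        # Encode the cleartext with this candidate key, then search for it.
        encoded = bytes(b ^ key for b in cleartext)
        offset = data.find(encoded)
        while offset != -1:
            yield offset, key
            offset = data.find(encoded, offset + 1)


# Hypothetical sample: a URL XOR-encoded with key 0x41, surrounded by other bytes.
sample = b"\x00\x01" + bytes(b ^ 0x41 for b in b"http://example.com") + b"\xff"
hits = list(find_xor_string(sample, b"http://"))

for offset, key in hits:
    print(f"cleartext found at offset {offset} with XOR key 0x{key:02x}")
```

Knowing the key, the rest of the encoded string can be decoded by XORing the bytes that follow the hit with the same key.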

Reader robv points out that the YARA documentation does not explicitly mention XOR string modifier support for regular expressions. That's what I read into it too, and why I was surprised that XOR string modifiers don't generate an error/warning when used with a regular expression.

And regarding performance: the XOR modifier has an impact, depending on your environment.

I've done some YARA "speed tests" in the past, and there are several parameters that influence such tests.

First of all, on Windows (haven't tested on other OSes yet), each file is read (mapped into memory) before it is scanned. Even when I use a dummy rule (like "rule dummy {condition: false}"), the complete file is processed.

When I do tests, caching has a huge impact. Running YARA with a single rule on a 4.2GB file (a Windows installation .iso file) for the first time takes 64 seconds. The second time, with the same rule and the same file, it takes 19 seconds.

Subsequent runs vary by several hundred milliseconds.

YARA is also multithreaded. Running with a single thread or multiple threads makes a difference in execution time.

So when you do performance tests, it's best to limit the influence of these parameters, for example by using a single thread and running the command several times (to cache the file).
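A timing harness along these lines could look like the following Python sketch. The function time_command is a name chosen for this example, and the command being timed is a stand-in so that the sketch runs anywhere; for a real test you would substitute a YARA command line such as the one in the comment (the rule and file names there are placeholders).

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5, warmup=1):
    """Run cmd warmup + runs times; return wall-clock seconds of the measured runs."""
    timings = []
    for i in range(warmup + runs):
        start = time.perf_counter()
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        # Discard the warm-up runs: they mostly measure reading the file into the cache.
        if i >= warmup:
            timings.append(elapsed)
    return timings


# Stand-in command. For a real test, something like (placeholder names):
#   cmd = ["yara", "-p", "1", "rule.yara", "windows.iso"]   # -p 1: single thread
cmd = [sys.executable, "-c", "pass"]

timings = time_command(cmd, runs=3)
print(f"average: {statistics.mean(timings):.3f}s over {len(timings)} runs")
```

The warm-up run absorbs the cost of the first, uncached read, so the average reflects only runs where the file is already cached.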

Scanning that 4.2GB file with the first YARA rule of my diary entry takes 19 seconds (average), and the same rule without XOR modifier takes 8 seconds (average).

That's because of the way YARA works (with atoms used by the Aho-Corasick algorithm) and how XOR is implemented: an atom extracted from a string leads to 255 atoms when the XOR modifier is applied.
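The expansion from 1 atom to 255 can be made concrete with a small Python sketch. This is only an illustration, not YARA's code: the atom "http" is a hypothetical choice (YARA selects atoms itself), and the key range 1 to 255 is assumed from the figure of 255 mentioned above.

```python
def xor_variants(atom: bytes):
    """Return a dict mapping each single-byte key (1..255) to the XOR-encoded atom."""
    return {key: bytes(b ^ key for b in atom) for key in range(1, 256)}


# Hypothetical 4-byte atom taken from a string like "http://".
variants = xor_variants(b"http")

# Every key yields a different byte sequence, so the search automaton
# has to look for 255 patterns instead of 1.
print(len(variants), "keys,", len(set(variants.values())), "distinct atoms")
print("key 0x41 ->", variants[0x41].hex())
```

More patterns to look for generally means more candidate matches to verify, which is consistent with the longer scan time measured above.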

Nevertheless, it also depends on the content you are scanning: I'm able to create a file where the opposite is true, where a "normal scan" takes 19 seconds and an "XOR scan" takes 8 seconds.

Explaining this requires more time, which I'll dedicate to an upcoming diary entry.

Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com DidierStevensLabs.com

Keywords: XOR YARA