No IOCs? No Problem! Getting a Start Hunting for Malicious Office Files

Published: 2020-04-15. Last Updated: 2020-04-15 14:45:54 UTC
by Rob VandenBrink (Version: 2)
10 comment(s)

Most of us know that macros in Office documents are one of the most common ways to get malware into an organization.  Unfortunately, all too many organizations depend on their AV products to detect these macros and the associated malware.  It's a sad fact that macros are easy to write, and it's not too tough to evade AV by being smart about how you write a malicious macro.

Even worse, there is continued push-back from management that simply blocking macros entirely is something that "can't possibly be done", because some critical doc or other might get dropped in the process - usually without any real examples of said critical files.  (Even though from what I typically see, the more critical a document is, the less likely it is to have a macro in it.)

So that leaves us with potential malware on the inside of our organization.  If our AV product won't detect it, how can we find it?  As always, my first go-to is "could we write a PowerShell script to help with that?", and it turns out that yes, you can!

What we are looking for is:

  • Office files that have macros in them
  • Office files that have the "it came from the internet" flag set on them
  • Zero byte office files

Let's start from the bottom of that short list and work up.

A "zero byte" file of any kind is a great indicator in itself - often these are files that your AV product actually detected, and prevented from being saved on disk.  Why are we interested in these?  First of all, you might want to contact the person involved with that file, discuss with them what they might have been doing at that date/time, and suggest that receiving random Office files from strangers is a really bad idea.  If you run these scans frequently, you're likely asking them to think back an hour or two, not asking them about last week.  Secondly, just because the file is zero bytes doesn't mean that the macro and associated malware didn't detonate.  You might still want to look at that person's workstation, the other files in their profile directory, and other data locations.
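
As a minimal sketch of this first check, the snippet below walks a directory tree once and returns only the zero-byte files with Office-style extensions.  The function name "Find-ZeroByteOfficeFile" and the default extension list are illustrative assumptions, not part of the script later in this post, and the "-in" operator needs PowerShell 3.0 or later.

```powershell
# Hypothetical helper: list zero-byte files that have Office-style extensions.
function Find-ZeroByteOfficeFile {
    param(
        [Parameter(Mandatory)][string]$Path,
        # extensions of interest (assumed list - adjust for your environment)
        [string[]]$Extension = @('.xls', '.xlsx', '.xlsm', '.doc', '.docx', '.docm')
    )
    # one pass over the tree; -in compares case-insensitively by default
    Get-ChildItem -LiteralPath $Path -Recurse -File -Force |
        Where-Object { ($_.Extension -in $Extension) -and ($_.Length -eq 0) } |
        Select-Object FullName, LastWriteTime
}

# example: check the current user's temp directory
# Find-ZeroByteOfficeFile -Path $env:TEMP
```

The LastWriteTime column is there to support the "what were you doing at that date/time" conversation with the user.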

Next up - "it came from the internet".  This uses a pretty neat feature in Windows called "Alternate Data Streams".  An alternate data stream is actually a pointer to a whole other possible set of file content, which can contain different data (including malware).  Back in the day, this was used to support multiple filesystems (NFS, HFS and so on); however, a main use of alternate data streams these days is to add various flags to each file.  The one we're looking at is called "Zone.Identifier", which stores some indication of where the file came from (the "ZoneId") with the file itself.  You can explore Alternate Data Streams using "dir /R" or the "streams" command in Microsoft Sysinternals ( https://docs.microsoft.com/en-us/sysinternals/downloads/streams )

Enumerating the ZoneId for a single file is pretty simple in PowerShell - do a "get-content" on the file, using the "-Stream" parameter to read the Zone.Identifier stream, then look for the ZoneId line.  The ZoneId can be one of the following values:

0 = "Local machine"
1 = "Local intranet"
2 = "Trusted sites"
3 = "Internet"
4 = "Restricted sites"

PS D:\> $b = (get-content .\example.xlsx -Stream Zone.Identifier)

PS D:\> $b
[ZoneTransfer]
ZoneId=3
ReferrerUrl=https://www.google.com/
HostUrl=http://somefqdn/somepath/example.xlsx


Note that this datastream also includes the source URL and the referrer that the file came from - this can (of course) be useful as well.  If the site is known to be malicious, or is something we might sink-hole in our dns services, that's an IOC right there.  If it's a well-known site, then if the file ends up being malware we might want to pass the word up the chain, to let that organization know that their site might be compromised.

Narrowing our code down to just collect the value of the zoneid:

$zone = ($b -match "ZoneId").split("=")[1]

If that is 3, then the file is marked as "from the internet"
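
To make the number a little friendlier in reports, the lookup table above can be folded into a small helper.  This is a sketch only: the function name "Get-ZoneInfo" is hypothetical, and it takes the stream lines as plain strings, so on a real file you would feed it the output of the get-content command shown earlier.

```powershell
# Hypothetical helper: turn Zone.Identifier stream lines into a zone number and name.
function Get-ZoneInfo {
    param([string[]]$StreamLine)

    # names as listed in the table above
    $zoneNames = @{
        0 = 'Local machine'
        1 = 'Local intranet'
        2 = 'Trusted sites'
        3 = 'Internet'
        4 = 'Restricted sites'
    }

    foreach ($line in $StreamLine) {
        if ($line -match '^\s*ZoneId\s*=\s*(\d+)') {
            $id = [int]$Matches[1]
            return [pscustomobject]@{ ZoneId = $id; ZoneName = $zoneNames[$id] }
        }
    }
    # no ZoneId line found (assumption: report that as $null)
    return $null
}

# example, using the stream content shown above
$lines = '[ZoneTransfer]', 'ZoneId=3', 'ReferrerUrl=https://www.google.com/'
(Get-ZoneInfo -StreamLine $lines).ZoneName
```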


Finally, how can we tell if an Office file contains macros?  It turns out that in PowerShell you can "drive" a session in Word or Excel (or any Office app), and just open any of these files.  Once open, the presence of a macro can be read from a property of the open file.  For instance, to check for Excel macros:

$objExcel = New-Object -ComObject Excel.Application
#full path to file is required here
$WorkBook = $objExcel.Workbooks.Open("\\someshare\test.xlsm")

$workbook.hasvbproject
True

$WorkBook.close($false)
$objExcel.quit()


Or to check for Excel 4.0 macro sheets, look at the value of $workbook.excel4macrosheets, so in code you would just count them and look for non-zero values:

$workbook.Excel4MacroSheets.count -ne 0

The code for Word is slightly different, but the "HasVBProject" property name stays the same.
Be sure to open the file such that:

  • It's open as read-only.  This means that if someone else has the file open you can still check it, and that as you are checking the file, you're not preventing anyone else from opening it.  It also means that there's no chance that you'll be modifying the file contents, or its timestamp, owner or other metadata as you poke at it.
  • You don't update your MRU (Most Recently Used) list - if you're running this from a station that might actually use Office at some point, when you open Word or Excel you don't want your "most recent files" list to be full of other people's files
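
For Word, a minimal sketch that follows both of those rules might look like the one below.  It assumes Windows with Word installed, and the function name "Test-WordMacro" is hypothetical.  The arguments to Documents.Open are, in order, the file name, ConfirmConversions, ReadOnly and AddToRecentFiles.

```powershell
# Hypothetical helper: does a Word document contain a VBA project?
# Requires Windows with Microsoft Word installed (uses COM automation).
function Test-WordMacro {
    param([Parameter(Mandatory)][string]$Path)   # full path is required

    $word = New-Object -ComObject Word.Application
    $word.Visible = $false
    $word.AutomationSecurity = 3   # msoAutomationSecurityForceDisable
    try {
        # ConfirmConversions = $false, ReadOnly = $true, AddToRecentFiles = $false
        $doc = $word.Documents.Open($Path, $false, $true, $false)
        $result = [bool]$doc.HasVBProject
        $doc.Close($false)         # close without saving
        $result
    }
    finally {
        # always shut the background Word instance down
        $word.Quit()
    }
}

# example:
# Test-WordMacro -Path '\\someshare\test.docm'
```

Starting and stopping Word for every file is slow, so for more than a handful of files you would open the application once and loop over the files.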

All this being said, where should we look for these files?  If you redirect the various "my documents" folders in a group policy, then look there!  If you have a share that is the "store your stuff here" share in your organization, then look there!  If you are looking for inadvertent "oops, I clicked a link" events, then look in users' temp directories.  If the actual user is running the "hunting" script, maybe as part of the login script, the location of that user's temp directory can be read in PowerShell from the environment variable: $env:temp or $env:tmp (by default these have the same value - if they differ then check both)
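
The "if they differ then check both" idea can be sketched as a helper that returns each existing directory only once.  The function name "Get-TempTarget" is hypothetical; by default it reads $env:TEMP and $env:TMP as described above.

```powershell
# Hypothetical helper: return the distinct, existing directories from a candidate list.
function Get-TempTarget {
    param([string[]]$Candidate = @($env:TEMP, $env:TMP))

    $Candidate |
        # drop empty values and anything that is not an existing directory
        Where-Object { $_ -and (Test-Path -LiteralPath $_ -PathType Container) } |
        # collapse duplicates (the two variables usually point at the same place)
        Sort-Object -Unique
}

# example: loop over whatever temp directories this user actually has
# foreach ($dir in Get-TempTarget) { write-host $dir }
```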

Anyway, with all that said, we can start collecting data.  In some organizations, any macros at all might be cause for concern.  In other organizations, the "macros from the internet" will be the red flag to look for.  If we collect everything discussed so far, we can slice and dice the collected data any way required once we have it.

# this script checks word and excel documents for zoneid, presence of macros and zero-byte file
# the file owner, last write date and last access date is also collected
# This can easily be extended to include project, visio and other office files
# updated source code will be located at: https://github.com/robvandenbrink

$targetlist = @()
$filelist = @()
$resultslist = @()

### input data ###
# file extensions of interest - update as needed
$exts = "xls","xlsx","xlsm","doc","docx","docm","dotm","xlm"

# the share to enumerate - use the knowledge of your environment to make an effective choice here
# or as Indiana Jones was told "choose wisely"
# if the user is running this, to check that users' temp directory:
# $targetshare = $env:temp

# or if you are targeting a user or department share, specify the
# full path to the share
# $targetshare = "\\some\fully\qualified\unc"
# in any case, update this variable to best suit your organization and situation:

$targetshare = "L:\testing"


# add a trailing backslash if not present in the share defined above
if ($targetshare.substring($targetshare.length -1) -ne "\") { $targetshare += "\" }

# collect all of the filenames that match the identified extensions
# this can take a while in a large environment
foreach ($ext in $exts) {
    $fullpath = $targetshare + "*." + $ext
    $targetfiles = get-childitem -Path $fullpath -Recurse -file -Force
    $filelist += $targetfiles
    }

# yes, this loops multiple times, so is less efficient time-wise, but is more efficient in how
# many filenames are collected.  
# The alternative is to make one pass, collect all the filenames and winnow the list down from there
# if you prefer that option, it would look something like:
#
# $allfiles = get-childitem -Path $targetshare -Recurse -file -Force
# foreach ($ext in $exts) {
#    $tfiles = $allfiles | where-object { $_.name -like ("*." + $ext) }
#    $filelist += $tfiles
#    }
# or if you want to do it in one line, you can pipe the get-childitem statement into a where-object command,
# with your various extensions hard-coded (hard-coding anything is $bad)


# with the targetfile list collected, loop through and collect:
#            which zone did the file come from?
#            does the file contain macros?
#            when was the file created?
#            when was the file last accessed?
#            is the file password protected?  (another common IoC for malware, but real people do this too)
#            and who saved the file? (who is the file owner)
#
# this opens each file in the matching MS Office application, so it can take a while as well
# be sure to open the various office apps **once**, then open each file in turn, collect the data,
# then close that file before proceeding to the next one.
# be sure to close the office app when done
#

# Open the office apps.  Set them both to run in the background
$objExcel = New-Object -ComObject Excel.Application
$objWord = New-Object -ComObject Word.Application
$objExcel.visible = $false
# Disable macro execution (thanks to our anonymous reader for pointing this out)
# also disable alerts (note that this does not apply to alerts due to macros)
$objExcel.AutomationSecurity = 3 # msoAutomationSecurityForceDisable
$objExcel.DisplayAlerts = $false

$objWord.visible = $false
$objWord.AutomationSecurity = 3 # msoAutomationSecurityForceDisable
$objWord.DisplayAlerts = $false

##########
# Vars for file open
# XLImportFormat is set to 5, don't convert anything.  This isn't used, but is needed to
# test for password-protected files (you can't skip variables as you open files)
$ConfirmConversions = $false
$UpdateLinks = 0
$ReadOnly = $true
$AddToRecentFiles = $false
$XLImportFormat = 5

foreach ($indfile in $filelist) {
    $f = $indfile.fullname
    $ext = $indfile.extension

    # zero out critical values for each loop
    $hasmacro = $false
    $hasxl4macro = $false
    $zone = 0
    $pwdprotected = $false
    $zerosize = $false
 
    # zero size?
    if ($indfile.length -eq 0)  { $zerosize = $true }

    # collect alt datastream info (zone)
    $b = (get-content $f -stream Zone.Identifier -erroraction 'silentlycontinue' )

    if ( $b.length -gt 0 ) { $zone = ($b -match "ZoneId").split("=")[1] }


    # EXCEL SECTION

    # skip zero byte files, but record them - possibly AV caught these during a file save
    # also check for and skip pwd protected files. Record them as *potential* malware

    if(( $ext.substring(0,3).tolower() -eq ".xl") -and (-not $zerosize)) {
        # collect excel specific info (are there macros?)
        # full path is required to open the file
        # echo the filename with path, just so we can monitor progress
        # and be sure the script is still running :-)
        write-host $f

        # is it password protected?
        try {
            $WorkBook = $objExcel.Workbooks.Open($f,$UpdateLinks,$ReadOnly,$XLImportFormat,"a")
            }
        catch {
            $pwdprotected = $true
            }
        $error.clear()

        # check the file if we are able to, then close it:
        if((-not $pwdprotected) -and (-not $zerosize)) {
            # excel macros?
            $hasmacro = $workbook.hasvbproject
            # excel 4.0 macro sheets?  count them on the workbook we just opened
            $hasxl4macro = ($WorkBook.Excel4MacroSheets.count + $WorkBook.Excel4IntlMacroSheets.count) -gt 0
            $WorkBook.close($false)
            }
        }

    # WORD SECTION
    if(( $ext.substring(0,3).tolower() -eq ".do") -and (-not $zerosize)) {
        # collect word specific info (are there macros?)
        #full path is required to open the file
        write-host $f
        # is it password protected? (a dummy password will trigger
        # the error condition if a password exists, no error if no pwd)
        try {
            $Doc = $objWord.documents.Open($f,$ConfirmConversions,$ReadOnly,$AddToRecentFiles,"a")
            } catch {
            $pwdprotected = $true
            }
        $error.clear()

        if((-not $pwdprotected) -and (-not $zerosize)) {
            # word macros?
            $hasmacro = $doc.hasvbproject
            $Doc.close($false)
            }
        }

    # add all info to the list
    $tempobj = [pscustomobject]@{
                fname = $f
                zone = $zone
                macro = ($hasmacro -or $hasxl4macro)
                PwdProtected = $pwdprotected
                ZeroSize = $zerosize
                LastWriteTime = $indfile.LastWriteTime.tostring()
                LastAccessTime = $indfile.LastAccessTime.tostring()
                Owner = $indfile.GetAccessControl().Owner
                }

    $resultslist += $tempobj
    }

# Close out the two apps
$objWord.quit()
$objExcel.quit()


Now, with everything in a variable list, what "internet files" have macros?  (note - to export to a CSV file, use "export-csv" instead of "out-gridview")

$resultslist  | Where { ($_.zone -eq 3) -and ($_.macro -eq $true) } | out-gridview

or, if we're trying to just locate office files with macros:

$resultslist | where { $_.macro -eq $true } | out-gridview

Zero sized files?

$resultslist | where { $_.zerosize -eq $true } | out-gridview

You get the idea, with everything in hand, slice and dice as needed.  Or export to a CSV and use Excel as your "slicer and dicer" if you are more comfortable there.
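
As a sketch of the CSV route, the example below filters a results list and writes the hits to a file.  The three records are made-up sample data that only mimic the shape of $resultslist, and the output file name is an assumption.

```powershell
# Sample data only - these records are invented to show the shape of $resultslist
$resultslist = @(
    [pscustomobject]@{ fname = 'L:\testing\a.xlsm'; zone = '3'; macro = $true;  PwdProtected = $false; ZeroSize = $false }
    [pscustomobject]@{ fname = 'L:\testing\b.docx'; zone = 0;   macro = $false; PwdProtected = $false; ZeroSize = $true }
    [pscustomobject]@{ fname = 'L:\testing\c.docm'; zone = 0;   macro = $true;  PwdProtected = $false; ZeroSize = $false }
)

# internet-sourced files with macros; @() keeps a single hit countable
$hits = @($resultslist | Where-Object { ($_.zone -eq 3) -and $_.macro })

# write the hits to a CSV in the temp directory (assumed location - pick your own)
$csv = Join-Path ([System.IO.Path]::GetTempPath()) 'office-macro-hits.csv'
$hits | Export-Csv -Path $csv -NoTypeInformation
```

The "-NoTypeInformation" switch keeps Windows PowerShell from writing a "#TYPE" header line that would otherwise show up as the first row when the file is opened in Excel.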

Alternatively, if this is in a login script (so is run by each user, against their files as they log in), and your target is "$env:temp" then you might want to dump these to a CSV file, maybe based on userid and workstation name.  We're only interested if something is found, so there's a check for that in the "if" statement.
Finally, we want to run this as the user, so that we don't have to deal with monkeying with access rights to "temp" folders and so on.  In your login script, you'll want to bypass the execution policy (don't change the default though, you don't want to give folks the rights to run powershell accidentally):

powershell -ExecutionPolicy Bypass -file \\someserver\someshare\FindOfficeMacros.ps1

You'll want to modify the example script above to collect temp files and output any findings to a central location:

# at the beginning of the script:
$targetshare = $env:temp

# ... main script goes here

# output section at the end ..

# be sure to terminate the share name with a "\"
$share = "\\someserver\someshare\"
$outfile = $share+$env:USERNAME+"-"+$env:COMPUTERNAME+".csv"

# check so that we only output a file if we find something
# (wrapping the result in @() makes sure a single hit is still counted)
$r = @($resultslist | Where { $_.macro -eq $true })
if ($r.count -gt 0) { $r | export-csv -Path $outfile -NoTypeInformation }


Again, this is more of a "concepts" blog - change things up to match your environment, just be sure that you ONLY open files as read-only.  Having powershell script MS office against all of your office files has some serious potential for damage - you can easily ransomware yourself (without the ransom possibility).

I did not collect the HostURL or ReferrerURL link variables for any files - that should be easy enough to add if you need that information.
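
If you do want those two values, one possible sketch is below.  The function name "Get-ZoneUrl" is hypothetical; it takes the Zone.Identifier stream lines (the $b variable in the main script) and returns both URLs, which could then be added as extra fields on each result object.

```powershell
# Hypothetical helper: pull HostUrl and ReferrerUrl out of Zone.Identifier stream lines.
function Get-ZoneUrl {
    param([string[]]$StreamLine)

    $result = [ordered]@{ HostUrl = $null; ReferrerUrl = $null }
    foreach ($line in $StreamLine) {
        # split on the first "=" only, since URLs can contain "=" themselves
        $name, $value = $line -split '=', 2
        if ($name -in 'HostUrl', 'ReferrerUrl') { $result[$name] = $value }
    }
    [pscustomobject]$result
}

# example, using the stream content shown earlier in this post
$lines = '[ZoneTransfer]', 'ZoneId=3', 'ReferrerUrl=https://www.google.com/', 'HostUrl=http://somefqdn/somepath/example.xlsx'
(Get-ZoneUrl -StreamLine $lines).HostUrl
```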

If you've used this approach and found something interesting, please let us all know via the comment form! (NDAs permitting of course).

 

===============
Rob VandenBrink
rob@coherentsecurity.com


Comments

Hi,

Before opening an Excel file you need to disable macros, even when using PowerShell (and "update links" as well):


$Excel = New-Object -ComObject Excel.Application

$Pfad = 'C:\Users\xxxxx\Desktop\'
$file = 'Test.xlsm'

$Excel.AutomationSecurity = 3 # msoAutomationSecurityForceDisable

$WB = $Excel.Workbooks.Open($Pfad + $file, 0, 1)

write-host ('vbaProject: ' + $WB.HasVBProject)

$WB.close(0)
$Excel.quit()

regards
Good catch, thanks very much! I'll update the code in the post.
I'll also disable alerts ( $application.DisplayAlerts = "wdAlertsNone" )
Great work, but actually I continue to receive this error:

(Exception HRESULT: 0x80010108 (RPC_E_DISCONNECTED))


Any ideas? Thanks.
This seems to be a popular issue (just googled it). It's a communications error, and like most RPC communications errors, either it's an obvious one (firewall up at the far end for instance), or it seems to be occurring for no good reason ... :-)

Anyway, I've seen a few forums where adding a simple 1 or 2 second wait worked this out - in some cases the scripting approach can be too fast for the application to keep up.
Try adding "sleep(1)" after the "$resultslist += $tempobj" line.

If that doesn't work, this is not an issue that I've seen, and I've run this in a few live environments. Can you be more specific on which line is generating the error?
Thanks a lot for your answer.
The sleep command didn't help unfortunately.

Here is the output (translated from German):

The object invoked has disconnected from its clients. (Exception from HRESULT: 0x80010108 (RPC_E_DISCONNECTED))
At FindOfficeMacros.ps1:139 char:9
+ $WorkBook.close($false)
+ ~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : OperationStopped: (:) [], COMException
+ FullyQualifiedErrorId : System.Runtime.InteropServices.COMException


This error occurs many many times after processing a few files.
Can we take this offline to email? - I'm at rob@coherentsecurity.com
Drop me a line and we'll take things from there?
Actually, shoot me the OS version and patch levels at both ends in your email? If this works a number of times then starts to consistently fail, this might be an issue of RPC port exhaustion. See this MS doc: https://support.microsoft.com/en-us/help/935677/fix-error-code-0x800706ba-may-be-generated-when-a-client-computer-make
Thanks. E-Mail sent
Nice post Rob, thank you for sharing. I think it should also be possible to translate your hunting techniques into a YARA rule, maybe extending the publicly available ones (https://github.com/Yara-Rules/rules/blob/master/maldocs/). There are some (thanks to Didier Stevens, Florian Roth and others) that are already available for detecting VBA macros in Office documents, and it could be possible to extend them with a filesize condition and the dotnet module (capable of dealing with file streams, even if I've never tested it before).
Of course, using YARA means the engine has to be installed, instead of using a built-in tool such as PowerShell, but it can improve detection performance and avoid the risk of inadvertently running an AutoOpen macro while initializing the Office object in case of a security setup error. This is only my private opinion.
