Protected OOXML Spreadsheets

    Published: 2024-07-15
    Last Updated: 2024-07-15 04:54:57 UTC
    by Didier Stevens (Version: 1)

    I was asked a question about the protection of an .xlsm spreadsheet. I've written before on the protection of .xls spreadsheets, for example in diary entries "Unprotecting Malicious Documents For Inspection" and "16-bit Hash Collisions in .xls Spreadsheets"; and blog post "Quickpost: Remove Sheet Protection From Spreadsheets".

    .xlsm spreadsheets (and .xlsx) are OOXML files, and are thus ZIP containers holding mostly XML files.
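
    Because an OOXML file is an ordinary ZIP container, its members can be listed with nothing more than Python's standard zipfile module. The following is a minimal sketch, not the tooling used in this diary entry; the function names and the in-memory demo container are assumptions made for illustration.

```python
import io
import zipfile


def list_ooxml_members(source):
    """Return (name, uncompressed size) for each member of an OOXML container.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as container:
        return [(info.filename, info.file_size) for info in container.infolist()]


def build_demo_container():
    """Build a tiny in-memory ZIP that only mimics an OOXML layout (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as container:
        container.writestr("[Content_Types].xml", "<Types/>")
        container.writestr("xl/workbook.xml", "<workbook/>")
        container.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Replace the demo container with the path of a real .xlsx/.xlsm file.
    for name, size in list_ooxml_members(build_demo_container()):
        print(f"{size:8d}  {name}")
```

    Pointing list_ooxml_members at a real .xlsm file should show the worksheet parts, typically under xl/worksheets/.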

    The spreadsheet I'm taking as an example here has a protected sheet. Let's take a look at the XML file for this sheet.

    XML element sheetProtection protects this sheet. If you remove this element, the sheet becomes unprotected.
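
    Removing the element can be scripted. The sketch below copies a workbook member by member and strips sheetProtection from the worksheet parts. It assumes the element is written in self-closing form and that worksheets live under xl/worksheets/, which is what Excel normally produces; the function names are hypothetical, and this is not the tool used in this diary entry.

```python
import io
import re
import zipfile

# Matches a self-closing <sheetProtection .../> element (assumed form).
SHEET_PROTECTION = re.compile(rb"<sheetProtection\b[^>]*/>")


def remove_sheet_protection(sheet_xml: bytes) -> bytes:
    """Return the worksheet XML with any sheetProtection element removed."""
    return SHEET_PROTECTION.sub(b"", sheet_xml)


def unprotect_workbook(src, dst):
    """Copy the OOXML container `src` to `dst`, unprotecting every worksheet part."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for info in zin.infolist():
            data = zin.read(info.filename)
            is_sheet = (info.filename.startswith("xl/worksheets/")
                        and info.filename.endswith(".xml"))
            if is_sheet:
                data = remove_sheet_protection(data)
            zout.writestr(info, data)


if __name__ == "__main__":
    # Demo on an in-memory container with placeholder attribute values.
    src = io.BytesIO()
    with zipfile.ZipFile(src, "w") as z:
        z.writestr("xl/worksheets/sheet1.xml",
                   '<worksheet><sheetData/><sheetProtection sheet="1"/></worksheet>')
    src.seek(0)
    dst = io.BytesIO()
    unprotect_workbook(src, dst)
    with zipfile.ZipFile(dst) as z:
        print(z.read("xl/worksheets/sheet1.xml").decode())
```

    Working on raw bytes with a regular expression leaves the rest of the XML untouched, which avoids the namespace prefix rewriting an XML serializer might introduce.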

    The password used to protect this sheet is hashed, and the hash value is stored as an attribute of element sheetProtection.

    Let's print out each attribute on a different line:
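
    One way to do this is with Python's xml.etree.ElementTree. This is only a sketch: the sample worksheet below is made up, and its saltValue and hashValue are placeholders, not values from the spreadsheet discussed here.

```python
import xml.etree.ElementTree as ET

# Namespace of SpreadsheetML worksheet parts.
MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Hypothetical worksheet; saltValue and hashValue are placeholder strings.
SAMPLE_SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheetData/>'
    '<sheetProtection algorithmName="SHA-512" hashValue="aGFzaA==" '
    'saltValue="c2FsdA==" spinCount="100000" sheet="1"/>'
    '</worksheet>'
)


def sheet_protection_attributes(sheet_xml):
    """Return the attributes of sheetProtection as a dict ({} when the sheet is unprotected)."""
    root = ET.fromstring(sheet_xml)
    element = root.find(f"{{{MAIN_NS}}}sheetProtection")
    return dict(element.attrib) if element is not None else {}


if __name__ == "__main__":
    # One attribute per line.
    for name, value in sheet_protection_attributes(SAMPLE_SHEET).items():
        print(f"{name}: {value}")
```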

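    The password is hashed one hundred thousand times (attribute spinCount) with SHA-512 (attribute algorithmName) together with a salt (attribute saltValue, base64 encoded). The salt is prepended to the password for the initial hash; each following round hashes the previous digest followed by a 32-bit little-endian round counter. The result is stored in attribute hashValue (base64 encoded).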

    Here is the algorithm in Python:

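    import binascii
    import hashlib
    import struct

    def CalculateHash(password, salt):
        # The password is hashed as UTF-16LE, without a byte order mark
        passwordBytes = password.encode('utf-16-le')
        # Initial round: SHA-512 over the salt followed by the password
        digest = hashlib.sha512(salt + passwordBytes).digest()
        # spinCount rounds: previous digest followed by a 32-bit little-endian counter
        for iteration in range(100000):
            digest = hashlib.sha512(digest + struct.pack('<I', iteration)).digest()
        return digest

    def Verify(password, salt, hash):
        # salt and hash are the base64 strings from attributes saltValue and hashValue
        hashBytes = binascii.a2b_base64(hash)
        return hashBytes == CalculateHash(password, binascii.a2b_base64(salt))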
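
    As a quick way to exercise this scheme without a real spreadsheet, the sketch below takes the algorithm and spin count from an attribute dictionary instead of hard-coding them, and adds a round trip: it generates attributes for a known password and verifies them again. The helper names and the mapping of algorithmName to a hashlib name are assumptions for illustration.

```python
import base64
import hashlib
import os
import struct


def compute_hash(password, salt, spin_count, algorithm="sha512"):
    """Salted initial hash, then spin_count rounds of H(previous digest + counter)."""
    digest = hashlib.new(algorithm, salt + password.encode("utf-16-le")).digest()
    for iteration in range(spin_count):
        digest = hashlib.new(algorithm, digest + struct.pack("<I", iteration)).digest()
    return digest


def verify(password, attributes):
    """Check a candidate password against sheetProtection-style attributes."""
    # Assumed mapping: "SHA-512" -> "sha512", "SHA-1" -> "sha1", ...
    algorithm = attributes["algorithmName"].replace("-", "").lower()
    salt = base64.b64decode(attributes["saltValue"])
    expected = base64.b64decode(attributes["hashValue"])
    return compute_hash(password, salt, int(attributes["spinCount"]), algorithm) == expected


def make_attributes(password, spin_count=100_000):
    """Produce test attributes for a known password (hypothetical helper)."""
    salt = os.urandom(16)
    digest = compute_hash(password, salt, spin_count)
    return {
        "algorithmName": "SHA-512",
        "saltValue": base64.b64encode(salt).decode("ascii"),
        "hashValue": base64.b64encode(digest).decode("ascii"),
        "spinCount": str(spin_count),
    }


if __name__ == "__main__":
    attributes = make_attributes("P@ssw0rd")
    print(verify("P@ssw0rd", attributes), verify("wrong", attributes))
```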

    Spreadsheet protected-all.xlsx is a spreadsheet I created with three types of protection: modification protection, workbook protection and sheet protection.

    I released a new version of my tool to extract these hashes and format them for hashcat.

    For each extracted hash, the lines are:

    1. the name of the containing file
    2. the name of the protecting element (which can be removed should you want to disable that particular protection)
    3. the hashcat-compatible hash (hash mode 25300)
    4. a hashcat command to crack this hash with a wordlist
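
    The first two items of such a report (containing file and protecting element) can be approximated with a short scan of the container. This sketch is not the tool mentioned above and does not produce hashcat output: it simply reports every element that carries an attribute whose name ends in "hashValue". The element and attribute names in the demo container (fileSharing, workbookProtection, sheetProtection) follow my reading of the OOXML schema and should be treated as assumptions, as should the placeholder values.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def find_protection_hashes(source):
    """Return (member name, element name, attributes) for each hash-carrying element."""
    results = []
    with zipfile.ZipFile(source) as container:
        for name in container.namelist():
            if not name.endswith(".xml"):
                continue
            root = ET.fromstring(container.read(name))
            for element in root.iter():
                if any(key.lower().endswith("hashvalue") for key in element.attrib):
                    tag = element.tag.rsplit("}", 1)[-1]  # drop the namespace part
                    results.append((name, tag, dict(element.attrib)))
    return results


def build_demo_container():
    """In-memory stand-in for a protected workbook; all values are placeholders."""
    ns = 'xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as container:
        container.writestr(
            "xl/workbook.xml",
            f'<workbook {ns}><fileSharing hashValue="AA==" saltValue="AA=="/>'
            '<workbookProtection workbookHashValue="AA==" workbookSaltValue="AA=="/>'
            '</workbook>')
        container.writestr(
            "xl/worksheets/sheet1.xml",
            f'<worksheet {ns}><sheetProtection hashValue="AA==" saltValue="AA=="/>'
            '</worksheet>')
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for member, element, attributes in find_protection_hashes(build_demo_container()):
        print(member, element, sorted(attributes))
```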

    You can imagine that cracking these hashes with hashcat is rather slow, because 100,000 SHA-512 hash operations need to be executed for each candidate password. On a desktop with an NVIDIA GeForce RTX 3080 GPU, I got around 24,000 hashes per second.
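
    To put that rate in perspective, here is a small back-of-the-envelope calculation. It assumes the reported figure counts candidate passwords per second and that each candidate costs one initial hash plus spinCount rounds; the 10-million-entry wordlist is a hypothetical example.

```python
SPIN_COUNT = 100_000            # attribute spinCount
CANDIDATES_PER_SECOND = 24_000  # rate reported above for the RTX 3080


def sha512_operations_per_second(rate=CANDIDATES_PER_SECOND, spin_count=SPIN_COUNT):
    """Each candidate costs one salted initial hash plus spin_count rounds."""
    return rate * (spin_count + 1)


def seconds_to_exhaust(candidate_count, rate=CANDIDATES_PER_SECOND):
    """Worst-case time to try every candidate in a wordlist."""
    return candidate_count / rate


if __name__ == "__main__":
    print(f"{sha512_operations_per_second():,} SHA-512 operations per second")
    # Hypothetical wordlist with 10 million entries.
    print(f"{seconds_to_exhaust(10_000_000) / 60:.1f} minutes for 10 million candidates")
```

    Under these assumptions the GPU performs roughly 2.4 billion SHA-512 operations per second, yet a 10-million-word list still takes about 7 minutes per hash.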

    Didier Stevens
    Senior handler
