Threat Level: green Handler on Duty: Rob VandenBrink

SANS ISC: Quickie: Parsing XLSB Documents - SANS Internet Storm Center SANS ISC InfoSec Forums

Participate: Learn more about our honeypot network
https://isc.sans.edu/honeypot.html

Sign Up for Free!   Forgot Password?
Log In or Sign Up for Free!
Quickie: Parsing XLSB Documents

Inspired by Xavier's diary entry "XLSB Files: Because Binary is Stealthier Than XML", I took a look at Microsoft's XLSB specification.

This confirmed my hopes: the binary format of XLSB files is a sequence of TLV records, just like BIFF. At least for sheets and shared string tables, I haven't looked at the other file formats yet.

The type and length of each TLV record is a variable length integer: from 1 to 2 bytes (type) and from 1 to 4 bytes (length). It's stored in little-endian format, and the least significant bytes have all their most significant bit set. The most significant byte has its most significant bit cleared. 7 least significant bits are used to encode the integer value. This implies that the highest value for a type integer is number 16383.

I wrote a simple parser, it is still in beta: xlsbdump.py.

Didier Stevens
Senior handler
Microsoft MVP
blog.DidierStevens.com

DidierStevens

638 Posts
ISC Handler
Mar 30th 2022

Sign Up for Free or Log In to start participating in the conversation!