Parsing Binaries With Kaitai Struct
Kaitai Struct is a general-purpose declarative language for describing binary data structures. With it we can parse binary file formats, in-memory data structures, network packets, etc.
The target format to be parsed is first described in the Kaitai Struct language (KSY) and then compiled to source files that can be imported as libraries in one of the several programming languages it supports, such as Python, C++, Java, Lua, JavaScript, etc. The resulting file(s) from the compilation, together with the specific bindings for the chosen language, will allow us to easily access the parsed binary format fields and structures.
Quick introduction to the KSY format
The full user guide to the KSY format can be found on the Kaitai website. We’ll only outline the (simplified) very basics here:
KSY is a YAML-based format describing types and structures, distributed in several sections:
meta:
Contains metadata about the target binary format we are parsing, such as identifiers or the default endianness
seq:
Describes an ordered sequence of elements (attributes), such as the element identifier, type and size (or literal contents, e.g. magic numbers).
enums:
Maps integer constants to symbolic names for clarity, which can then be referenced using the enum key.
types:
Declares user-defined named types, each of which can contain any of the elements above, including other types elements.
instances:
Describes structures that lie outside of normal sequential parsing flow or just need to be loaded only by special request.
Practical example: ESP8266 firmware image
To demonstrate how an arbitrary binary format can be described with the KSY format we will use the ESP8266 firmware image header format. A brief file format description by the manufacturer (Espressif) can be found here:
The firmware file consists of a header, a variable number of data segments and a footer. Multi-byte fields are little-endian.
However, that’s not entirely accurate. As it turns out, a firmware image can contain multiple sections, each one containing a header, one or more segments and a footer, as depicted below:
For example, when an image supports OTA upgrades (which is pretty common these days) a bootloader must be flashed in the first 4 KB (0x1000 bytes) of the memory.
Each section header contains the necessary information to map the segment(s) it contains into RAM (see the memory layout). Right after the section header, the first segment starts, with a small header containing the memory offset where it should be mapped and its size. If there were more segments present, they would be laid out immediately afterwards.
When the last segment ends, the whole section is padded with zeros until its size is one byte less than a multiple of 16 bytes. A last byte (thus making the section size a multiple of 16) is the checksum of the data of all segments. The checksum is defined as the xor-sum of all bytes and the byte 0xEF.
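The checksum rule above can be sketched in a few lines of Python. This is only an illustration of the xor-sum as described in the text; the name section_checksum and the choice of passing the segment payloads as a list of bytes objects are assumptions made for this example, not part of any Espressif or Kaitai API.

```python
from functools import reduce
from operator import xor

# Initial value of the xor-sum, as stated in the format description.
CHECKSUM_SEED = 0xEF


def section_checksum(segments_data):
    """XOR every data byte of every segment together with the 0xEF seed.

    segments_data is an iterable of bytes-like objects, one per segment
    (hypothetical calling convention chosen for this sketch).
    """
    checksum = CHECKSUM_SEED
    for data in segments_data:
        # Iterating over bytes yields integers, so xor can fold them directly.
        checksum = reduce(xor, data, checksum)
    return checksum
```

A section with no segment data yields the seed itself (0xEF), and the result can be compared against the last byte of the section.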
Kaitai Struct Language
As we saw in the section above, a firmware image is made up of sections. First, we describe the section header, which always starts with a magic number followed by a number of fields:
section_header:
  seq:
    - id: magic
      contents: [0xE9]
    - id: num_segments
      type: u1
    - id: flash_interface
      type: u1
      enum: e_spi_flash_mode
    - id: flash_size
      type: b4
      enum: e_flash_size
    - id: flash_speed
      type: b4
      enum: e_flash_speed
    - id: entrypoint
      type: u4
  enums:
    e_spi_flash_mode:
      0: qio
      1: qout
      2: dio
      3: dout
    e_flash_size:
      0: size_512k
      1: size_256k
      2: size_1m
      3: size_2m
      4: size_4m
    e_flash_speed:
      0: speed_40mhz
      1: speed_26mhz
      2: speed_20mhz
      0xf: speed_80mhz
We use the contents key to check for the correct value (0xE9) at the start of the section. Both num_segments and flash_interface are single unsigned bytes, so we use u1. flash_size and flash_speed are 4 bits each, so we use b4. Since the flash interface, size and speed are named integer constants, we attach the enum key to each of them for clarity.
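To make the header layout concrete, here is a rough hand-written equivalent using only Python's struct module. It is a sketch, not the code Kaitai generates: the names SectionHeader and parse_section_header are made up for this example, and it assumes Kaitai's default bit order for b4 fields, where the first field (flash_size) takes the high nibble of the shared byte.

```python
import struct
from typing import NamedTuple

MAGIC = 0xE9  # value checked by the "contents" key in the KSY


class SectionHeader(NamedTuple):
    num_segments: int
    flash_interface: int
    flash_size: int
    flash_speed: int
    entrypoint: int


def parse_section_header(buf, offset=0):
    """Decode the 8-byte section header found at buf[offset:]."""
    # "<" = little-endian without padding: four single bytes and one u4.
    magic, num_segments, flash_interface, size_speed, entrypoint = (
        struct.unpack_from("<BBBBI", buf, offset)
    )
    if magic != MAGIC:
        raise ValueError(f"bad magic: expected 0xE9, got {magic:#04x}")
    return SectionHeader(
        num_segments=num_segments,
        flash_interface=flash_interface,
        flash_size=size_speed >> 4,     # high nibble (first b4 field)
        flash_speed=size_speed & 0x0F,  # low nibble (second b4 field)
        entrypoint=entrypoint,
    )
```

The integer values returned here correspond to the keys of the enums in the KSY snippet, e.g. a flash_size of 4 stands for size_4m.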
Now, a segment has three different fields: the memory offset where it should be located, its size, and the actual segment data:
segment:
  seq:
    - id: memory_offset
      type: u4
    - id: segment_size
      type: u4
    - id: data
      size: segment_size
We can just back-reference the segment_size field using the size key, and Kaitai will do all the heavy lifting when we compile our KSY file to our target programming language!
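For comparison, the same back-reference written by hand might look like the sketch below: read the two u4 fields, then use the size just read to slice out the payload. The function name parse_segment and its return shape are assumptions for illustration only.

```python
import struct


def parse_segment(buf, offset=0):
    """Read one segment at buf[offset:].

    Returns (memory_offset, data, next_offset), where next_offset is the
    position right after the segment data.
    """
    # Two little-endian u4 fields: load address and payload size.
    memory_offset, segment_size = struct.unpack_from("<II", buf, offset)
    start = offset + 8
    end = start + segment_size
    if end > len(buf):
        raise ValueError("segment data runs past the end of the buffer")
    return memory_offset, bytes(buf[start:end]), end
```

Returning next_offset lets a caller read consecutive segments by feeding it back in as the offset of the following call.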
The last part of a section is the footer, which is just zero padding up to a 16-byte boundary, with the last byte being the checksum of all segments in that section:
section_footer:
  seq:
    - id: padding
      type: u1
      repeat: expr
      repeat-expr: 15 - (_io.pos % 16)
    - id: checksum
      type: u1
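The repeat-expr arithmetic is easy to check by hand. The sketch below mirrors it in Python, with pos standing in for _io.pos (the current offset within the stream being parsed); the helper names are hypothetical.

```python
def footer_padding_length(pos):
    """Number of zero bytes before the checksum, as in the repeat-expr."""
    return 15 - (pos % 16)


def footer_end(pos):
    """Offset right after the checksum byte, for a footer starting at pos."""
    return pos + footer_padding_length(pos) + 1


# The footer always ends on a 16-byte boundary, whatever the start offset.
for pos in (0, 1, 15, 16, 20, 31):
    print(pos, footer_padding_length(pos), footer_end(pos))
```

Note the edge case: when the segments already end on a multiple of 16, the expression still yields 15 padding bytes, so the footer occupies a full extra 16-byte block.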
Now, we can easily glue all three parts together (header, segments, footer) as follows:
section:
  seq:
    - id: header
      type: section_header
    - id: segments
      type: segment
      repeat: expr
      repeat-expr: header.num_segments
    - id: footer
      type: section_footer
In a similar way as we did in segment, we reference the num_segments value from the header field to be used as the repeat-expr target, that is, Kaitai will read as many elements as instructed by that value.
Inspecting and parsing a binary file
The Kaitai Web IDE is a handy tool when it comes to inspecting our target, because it allows us to write the KSY file and watch how the different fields and structures we declare get parsed in real time. We will use the Tasmota firmware for the ESP8266-based Sonoff devices.
As we mentioned before, ESP8266 firmware images can contain more than one section; in this case a bootloader lives in the first 4 KB (0x1000 bytes), followed by the rest of the code and data:
seq:
  - id: bootloader
    type: section
    size: 0x1000
  - id: code
    type: section
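In plain Python terms, the size key on bootloader amounts to carving out a fixed-size window before parsing it, so whatever follows the bootloader's footer inside that window is simply skipped. A minimal sketch of that split, with the hypothetical helper split_image:

```python
BOOTLOADER_SIZE = 0x1000  # same value as the "size" key on the bootloader field


def split_image(image):
    """Split a raw image into the bootloader window and the remaining bytes."""
    if len(image) < BOOTLOADER_SIZE:
        raise ValueError("image is smaller than the bootloader area")
    return image[:BOOTLOADER_SIZE], image[BOOTLOADER_SIZE:]
```

Each of the two slices would then be parsed as a section on its own, with offsets counted from the start of the slice.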
And that’s it! If we load both our target file and KSY code in the Web IDE, we can see how the parsing gets done:
Compiling the KSY source to a Python class
Now that we have a fully functional KSY source, we can compile it to any supported language of our choice.
We will use Python to demonstrate how it works:
kaitai-struct-compiler --target python esp8266-image.ksy
The command above will generate an esp8266_image.py file containing an Esp8266Image class which will allow us to parse and manipulate all the fields we described from a Python script:
from esp8266_image import Esp8266Image

# Instantiate Esp8266Image object
target = Esp8266Image.from_file("sonoff-basic.bin")


# Function to print several values from a Section object
def print_section_info(section):
    print(f"[+] Flash size: {section.header.flash_size}")
    print(f"[+] Flash speed: {section.header.flash_speed}")
    print(f"[+] Entrypoint: {hex(section.header.entrypoint)}")
    print(f"[+] Num. segments: {section.header.num_segments}")
    for i, segment in enumerate(section.segments):
        print(f" |----[Segment {i}]")
        print(f" |----> Offset: {segment.memory_offset}")
        print(f" |----> Size: {segment.segment_size} bytes")
        print(" .")


# Print bootloader section info
print("\nBootloader")
print("----------")
print_section_info(target.bootloader)

# Print code section info
print("\nCode")
print("----")
print_section_info(target.code)
The simple code above will look like the following when executed:
Bonus: compiling to graphviz
It’s also possible to compile the KSY file to graphviz as follows:
kaitai-struct-compiler --target graphviz esp8266-image.ksy
The resulting file can then be rendered with any graphviz engine such as dot, producing images similar to the following:
Full KSY file
meta:
  id: esp8266_image
  file-extension: esp8266_image
  endian: le
seq:
  - id: bootloader
    type: section
    size: 0x1000
  - id: code
    type: section
types:
  section:
    seq:
      - id: header
        type: section_header
      - id: segments
        type: segment
        repeat: expr
        repeat-expr: header.num_segments
      - id: footer
        type: section_footer
  section_header:
    seq:
      - id: magic
        contents: [0xE9]
      - id: num_segments
        type: u1
      - id: flash_interface
        type: u1
        enum: e_spi_flash_mode
      - id: flash_size
        type: b4
        enum: e_flash_size
      - id: flash_speed
        type: b4
        enum: e_flash_speed
      - id: entrypoint
        type: u4
    enums:
      e_spi_flash_mode:
        0: qio
        1: qout
        2: dio
        3: dout
      e_flash_size:
        0: size_512k
        1: size_256k
        2: size_1m
        3: size_2m
        4: size_4m
      e_flash_speed:
        0: speed_40mhz
        1: speed_26mhz
        2: speed_20mhz
        0xf: speed_80mhz
  segment:
    seq:
      - id: memory_offset
        type: u4
      - id: segment_size
        type: u4
      - id: data
        size: segment_size
  section_footer:
    seq:
      - id: padding
        type: u1
        repeat: expr
        repeat-expr: 15 - (_io.pos % 16)
      - id: checksum
        type: u1