Thursday, June 04, 2015

Broken, Abandoned, and Forgotten Code, Part 7

In the previous post, I finished discussing the abCheckBoardID() function. I called attention to a checksum in the header generated by an unknown algorithm. I provided a python implementation of that algorithm ported from IDA disassembly. In total, I identified four fields parsed by this function, accounting for 30 bytes of the 58 byte header.

In this part I'll give an overview of the remaining functions that parse and validate the firmware header. By the end we will be able to generate a header that allows the firmware to be programmed to flash memory. I won't discuss each header field in quite as much detail as I did previously, but if you've made it this far, it shouldn't be too hard to understand how each field is used.

Updated Exploit Code

The update to the exploit code for Part 6 added a module to regenerate a checksum found in the header. This update populates a couple of additional checksums as well as a few other fields. The code provided for Part 7 is sufficient to generate a firmware header that will pass the web server's validation. Given a valid kernel and filesystem image, you should be able to generate a firmware image that the web interface will happily upgrade to. If you've previously cloned the repository, now would be a good time to do a pull. You can clone the git repo from:

Of Checksums and Sizes

After the abCheckBoardID() function (discussed in part 6) there are a few more functions that parse or validate portions of the header. Identifying these fields and their purpose is challenging due to the fact that values may be parsed out in one function, but not used until some other function or functions, if at all.

The two functions that parse out values from the header are upgradeCgi_setImageInfo() at 0x004356B0 and upgradeCgiCheck() at 0x004361F8. The "setImageInfo" function is a short one. It parses several header fields, but it doesn't inspect or use any of them. The values are stored in global variables for later use. You can identify offsets of these fields using string patterns as described previously. As you identify these locations where the parsed values are located, rename the variables in IDA to something more meaningful, so you can identify them later when they are used. I renamed them to correspond with the offsets they were parsed from.

Renaming global variables corresponding to header offsets.
The upgradeCgiCheck() function validates a few fields parsed out previously. At 0x004362BC we see the return of our friend, calculate_checksum(). This time the checksum is computed across more than just the firmware header. At the "update" step, the data argument points to the "HDR0" portion of the firmware. This suggests the checksum is across the TRX image that follows the 58 byte header. The size argument is the sum of the values found at offsets 24 and 28. Inspecting the values at those positions in a stock firmware, we see 0x00871000 at offset 24, and 0x0 at offset 28. It's clear that bytes 24 - 27 are the size of the firmware image minus the 58 bytes at the start. Based on its use here, the bytes 28 - 31 are also a size of some sort.

At any rate, the size passed to calculate_checksum() at the update stage at 0x004362DC is the size of the TRX image. At 0x0043630C, the checksum is compared to the value taken from offset 32. We now know three more fields in the firmware header: offsets 24, 28, and 32. That's 42 bytes down, 16 to go.

checksuming TRX image
Checksum of the firmware's TRX image.

We're not done with checksums just yet. The basic block at 0x0043643C is another checksum operation. Once again the data points to "HDR0", but the size is only the value from offset 24. The size from offset 28 is not used this time. The checksum result is the same as before, but this time compared to the value at offset 16. We now know the checksum we compute and store at offset 32 must also be stored at offset 16.

At this point we can speculate this firmware format supports multiple partitions or sections. The value at offset 24 would be the size of partition 1, and offset 28 would be the size of partition 2. The checksum at offset 16 would be calculated over partition 1, and offset 32's checksum would be calculated over partitions 1 and 2 combined.

We're now down to 12 unidentified bytes. Let's have a look at an updated header diagram to see how things look.

header diagram 2
What we know so far about the firmware header.

The diagram is starting to fill in, and things are looking quite a bit better.

Version String

Moving on, at 0x00436580, more data is parsed out of the firmware image. This time the values are pulled out one byte at a time. This frustrates the technique of using the 3+ byte patterns to identify offsets. Based on the format strings from subsequent sscanf() and sprintf() operations, we can speculate that these values are transformed in some way into the version string displayed in the web interface.

Although the version string ends up being only cosmetic, and not an essential part of the firmware validation, it's still interesting enough to discuss here. Modifying the version string would be a nice way to visually demonstrate that the target is, in fact, running your custom firmware, and not the stock firmware.
[Update: Turns out this isn't quite right. There is a string table stored in flash memory that also contains the version string, and that string is displayed in the web interface. The version field in the firmware header is only (as far as I can tell) rendered during the update process so the user can see what version they're updating to.]

It took some debugging, but it turns out the single byte values that compose the version string don't actually get used until a few functions later, in upgradeCgi_GetParam() at 0x00436B4C.

generating firmware version string

What is happening here is a version string is being generated to display in the web browser so that the user can confirm what version of the firmware they're about to upgrade to.

Firmware Version String

The version string "V65.97.51.65_97.52.65" from the screenshot above appears to be composed of the decimal representations of ASCII characters from Bowcaster's pattern string. We can be sure by replacing bytes 8 - 15 with a string of non-repeating characters: "stuvwxyz". When we do this, the version string becomes "V116.117.118.119_120.121.122". This confirms the hypothesis; these are the decimal representations for t,u,v,w,x,y, and z. Note that "s" is not included. Even though byte 8 was parsed out along with the rest, it appears to go unused.

Firmware Version String 2

We can now update the header diagram to reflect the version bytes.
header diagram 3

(Mostly) Complete Firmware Header

The header diagram now has only 4 bytes (5 if you count the unused version byte at offset 8) that haven't been identified. It's unclear what these bytes are for, since they are never inspected. A likely explanation is that a checksum for theoretical partition 2 belongs at offset 20. The stock firmware has 0x0 at offset 20, which jives with a partition 2 size of 0. At any rate, this header is sufficient for execution to reach the point where the uploaded firmware gets written to /dev/mtd1.

WARNING: If you are debugging httpd on on actual hardware rather than in emulation, there's a chance your router will end up bricked if you attempt to upgrade to a customer firmware image. Eventually, we must test on actual hardware, but before then, I'll describe how to access the device's serial console using a UART to USB cable. Using the serial console, you can recover from a bad firmware update, a feature I had to use many times during my original research.

In the next part, with a better understanding of the firmware format, we'll loop back to the UPnP daemon and pick up where we left off there. Wouldn't it be nice if we could use the now documented header format to generate a firmware that will work with the UPnP daemon using our existing exploit code?