Unpacking HP Firmware Updates – Part 4

Part 4 — Memory map leads us to our destination

Andrey Zagrebin, Moshe Kol, Shlomi Oberman

This post is the fourth and final part of a four-part blog series documenting the different structures and stages of the firmware update.

In the previous post we detailed the flash layout and the sliding window compression used to store memory sections on-disk.
We now have a raw flash image on our hands.

The application loader

At this point, we have preloaded the last few code sections, decompressing some of them, followed by general decompressing. We’re done and ready to look at the application code, right?

Don’t get excited yet. The path to enlightenment is almost as long as the path to HP firmware unpacking.


Following the newly loaded code, we reach yet another indirect call. From the debug strings around the call opcode, it looks like this is the entry point of the printer application code:

post4_application_entry_call.PNG

This entry point ultimately comes from decoding a structure already loaded into memory:

The apphdr struct

For reasons that will become apparent later, we refer to this structure as the application header or apphdr, and to the code that uses it as the application loader (or app loader).

The address 0x4fffc004 is the start of this structure, and at 0x4fffc038 we find the entry point, 0x4145a9b4+1. This address is once again in a not-yet-initialized part of RAM. Reverse-engineering the function that parses the application header, we learn valuable implementation details, presented in the following paragraphs.

One of the first operations in the app loader is displaying the bootsplash bitmap picture. This picture is identical to the one found in the Flash image before loading the firmware to RAM.

Next, the application loader again performs memset, memcpy, and decompression operations on chunks of memory. Curiously, both the pre-loader and application loader have their own copies of these functions rather than sharing one set. This duplication suggests a possible organizational barrier between the pre-loader and app loader software development teams.

This time though, instead of using hardcoded arguments, the app loader invokes these functions in a loop, reading sets of parameters from memory pointed to indirectly by the app header.

Here’s an example of decompilation of that part of the app loader that invokes memcpy in bulk:

The memcpy part of the app loader

Notice the verify_address function. This function checks whether the address range about to be written overlaps any of the so-called “protected ranges”. A protected range is a range of memory addresses that will not be overwritten, even if a section is marked for loading at an address that overlaps it. If there is an overlap, the loader does not invoke the relevant memcpy, memset or uncompress for that section. To check whether an address range is protected, the loader compares it against 0x1a (26) pairs of start and end addresses of protected memory ranges. The array of address pairs is also pointed to by the apphdr. We’ll see why these ranges are special when we discuss the different memory sections.
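The check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the loader's actual code: the function name, the treatment of the end address as exclusive, and the sample ranges are all assumptions.

```python
from typing import Iterable, Tuple

AddrRange = Tuple[int, int]  # (start, end); end is treated as exclusive (assumption)


def overlaps_protected(start: int, size: int, protected: Iterable[AddrRange]) -> bool:
    """Return True if [start, start + size) intersects any non-empty protected range."""
    if size <= 0:
        return False
    end = start + size
    for p_start, p_end in protected:
        if p_start >= p_end:
            continue  # an empty range protects nothing
        if start < p_end and p_start < end:
            return True
    return False


# Hypothetical protected ranges, for illustration only (the real table has 0x1a pairs).
protected_ranges = [(0x1000, 0x2000), (0x3000, 0x3000), (0x8000, 0x9000)]

for dest, size in [(0x0800, 0x400), (0x1F00, 0x200), (0x3000, 0x100)]:
    action = "skip" if overlaps_protected(dest, size, protected_ranges) else "load"
    print(f"{dest:#x}+{size:#x}: {action}")
```

A loader script that mimics the app loader would run this test before each memcpy, memset or uncompress and skip the operation on a hit.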

The memcpy parameters are stored as an array of triplets of the form:

Offset  Length  Type    Description
0       4       void*   dest – the start address of the block to initialize
4       4       void*   src – the start address of the block to read from
8       4       size_t  num – number of bytes to copy (size of the block)

And similarly for memset:

Offset  Length  Type    Description
0       4       void*   addr – the start address of the block to initialize
4       4       int     value – the byte value to set
8       4       size_t  num – number of bytes to set (size of block to initialize)

Note: Although the second argument represents a byte value, memset expects an int, which is recast internally to a byte, consistent with the libc version of memset.

And uncompress:

Offset  Length  Type    Description
0       4       void*   dest – the start address of the block to initialize
4       4       void*   src – the start address of the block of compressed data
8       4       size_t  compressed_size – size of the compressed block
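All three tables share the same 12-byte layout, so one parser can serve them all. The following is a minimal sketch that assumes little-endian 32-bit fields and models RAM as a bytearray starting at a hypothetical base address; the helper names are ours, and uncompress is left out because it depends on the decompression routine.

```python
import struct
from typing import Iterator, Tuple

TRIPLE = struct.Struct("<III")  # three 32-bit fields; little-endian is an assumption


def iter_triples(table: bytes) -> Iterator[Tuple[int, int, int]]:
    """Yield one 3-tuple per 12-byte entry of a parameter table."""
    if len(table) % TRIPLE.size:
        raise ValueError("table length is not a multiple of 12 bytes")
    return TRIPLE.iter_unpack(table)


def apply_memset(ram: bytearray, base: int, table: bytes) -> None:
    for addr, value, num in iter_triples(table):
        off = addr - base
        # Only the low byte of the int is used, as with libc memset.
        ram[off:off + num] = bytes([value & 0xFF]) * num


def apply_memcpy(ram: bytearray, base: int, table: bytes) -> None:
    for dest, src, num in iter_triples(table):
        d, s = dest - base, src - base
        ram[d:d + num] = ram[s:s + num]


# Hypothetical 16-byte RAM window at base 0x1000, for illustration only.
BASE = 0x1000
ram = bytearray(b"ABCDEFGH" + bytes(8))
apply_memcpy(ram, BASE, TRIPLE.pack(0x1008, 0x1000, 4))
apply_memset(ram, BASE, TRIPLE.pack(0x100C, 0x1FF, 4))
print(bytes(ram))
```

A real loader script would also need bounds checks and the protected-range test before each write.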

There is quite a lot more code in the application loader, but we need only focus on the code that loads sections into memory, since that is all our goal requires: to reverse engineer the firmware and find security vulnerabilities.

In the end, execution is passed to the app_entry function, pointed to by the apphdr.

Application header structure

All the parameters related to the application loader reside in the application header and the memory it points to. Let’s go through the important members of the application header structure:

(“Offset” means the decimal offset from structure start. We omit irrelevant and unknown fields.)

Offset  Value       Name                     Description
0       0x3ca55a3c  magic                    Checked before the structure is used
4       0x6c        size                     Total size of the struct in bytes
8       0x0461090d  more_magic_1             See notes
12      0xfb9ef6f2  more_magic_2             See notes
20      0x4e0b0000  bootsplash_bmp           Pointer to the bootsplash bitmap image (BMP file format). This appears to be the same picture as the one found on the flash image before the code that is loaded to RAM
52      0x4145a9b5  entry_point              Pointer to the application entry point
56      0x4fffc000  protected_count          Pointer to a 32-bit integer counting the number of protected memory ranges
60      0x4fffc070  protected_addresses      Pointer to pairs of (start, end) protected memory ranges
64      0x4e10fcc0  section_linked_list      Pointer to a linked list of memory section descriptors
72      0x4e10fa68  memset_list_start        Start of the list of memset parameter triples
76      0x4e10fad4  memset_list_end          End of the list of memset parameter triples
80      0x4e10fad4  copy_list_start          Start of the list of memcpy parameter triples
84      0x4e10fbdc  copy_list_end            End of the list of memcpy parameter triples
88      0x4e10fbb8  copy_list_barrier        See notes
92      0x4e10fbdc  uncompress_list_start    Start of the list of uncompress parameter triples
96      0x4e10fcc0  uncompress_list_end      End of the list of uncompress parameter triples
100     0x4e10fca8  uncompress_list_barrier  See notes

Notes:

  1. All fields are 32 bits (4 bytes) long
  2. The purpose of the two more_magic fields is not clear; we conjecture they might be a version id or some kind of bitmask. Interestingly, their two values are bitwise complements of one another. Both values, except for the most significant nibble, are checked before reading from the apphdr. Technically, each value is masked with 0x0fffffff and tested against 0x0461090d and 0x0b9ef6f2.
  3. The copy_list_barrier field points to the middle of the memcpy parameter list, and is not used in this implementation of the loader. It may indicate that the values before this point have a different purpose than those following. uncompress_list_barrier points to the middle of the uncompress parameter list in much the same way.
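To make the header layout and the magic checks concrete, here is a hedged Python sketch that rebuilds the header from the values in the table, parses it back, and applies the masked comparisons from note 2. Little-endian byte order and all function names are our assumptions; unknown fields are simply zero-filled.

```python
import struct

APPHDR_SIZE = 0x6C
# (decimal offset, field name, observed value), taken from the table above.
TABLE = [
    (0, "magic", 0x3CA55A3C), (4, "size", 0x6C),
    (8, "more_magic_1", 0x0461090D), (12, "more_magic_2", 0xFB9EF6F2),
    (20, "bootsplash_bmp", 0x4E0B0000), (52, "entry_point", 0x4145A9B5),
    (56, "protected_count", 0x4FFFC000), (60, "protected_addresses", 0x4FFFC070),
    (64, "section_linked_list", 0x4E10FCC0),
    (72, "memset_list_start", 0x4E10FA68), (76, "memset_list_end", 0x4E10FAD4),
    (80, "copy_list_start", 0x4E10FAD4), (84, "copy_list_end", 0x4E10FBDC),
    (88, "copy_list_barrier", 0x4E10FBB8),
    (92, "uncompress_list_start", 0x4E10FBDC), (96, "uncompress_list_end", 0x4E10FCC0),
    (100, "uncompress_list_barrier", 0x4E10FCA8),
]


def parse_apphdr(raw: bytes, endian: str = "<") -> dict:
    """Decode the known fields and apply the magic checks described in the notes."""
    words = struct.unpack_from(f"{endian}{APPHDR_SIZE // 4}I", raw)
    hdr = {name: words[offset // 4] for offset, name, _ in TABLE}
    if hdr["magic"] != 0x3CA55A3C:
        raise ValueError("bad magic")
    if (hdr["more_magic_1"] & 0x0FFFFFFF) != 0x0461090D:
        raise ValueError("bad more_magic_1")
    if (hdr["more_magic_2"] & 0x0FFFFFFF) != 0x0B9EF6F2:
        raise ValueError("bad more_magic_2")
    return hdr


def triple_count(hdr: dict, prefix: str) -> int:
    """Number of 12-byte parameter triples between <prefix>_list_start and _end."""
    return (hdr[prefix + "_list_end"] - hdr[prefix + "_list_start"]) // 12


# Rebuild a sample header from the observed values (unknown fields left as zero).
words = [0] * (APPHDR_SIZE // 4)
for offset, _, value in TABLE:
    words[offset // 4] = value
raw = struct.pack(f"<{APPHDR_SIZE // 4}I", *words)

hdr = parse_apphdr(raw)
print(hex(hdr["entry_point"]))
print({p: triple_count(hdr, p) for p in ("memset", "copy", "uncompress")})
print(((~hdr["more_magic_1"]) & 0xFFFFFFFF) == hdr["more_magic_2"])
```

Dividing each list's byte length by 12 gives the number of operations the loader performs for that list.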

Memory sections and their descriptors

As briefly mentioned above, the apphdr has a field (section_linked_list) pointing to a linked list of memory section descriptors. The app loader code does not seem to use it. However, it contains information about the structure of the printer’s memory, including section names, which may aid us in loading and reverse-engineering the firmware.

The field section_linked_list points to the first element of this list and each element consists of the following members:

Offset  Type             Name
0       memory_section*  next
4       char*            section_name
8       void*            start_addr
12      size_t           size
16      uint?            unknown
20      memory_section*  dest_section

All members are 32-bit (4 bytes) long.

Following is a description of the element members:

  • next: Pointer to the next element of the linked list.

  • section_name: Pointer to a null-terminated string containing the section name

  • start_addr: The starting address of the section

  • size: The size of the section in bytes

  • unknown: The purpose of this field was not researched. It could contain information about the section type or various flags (e.g., rwx (“read-write-execute”) permissions). Values observed were: 1, 2, 4, 0xa, 0xc

  • dest_section: If this section is used to initialize another section (e.g. it is the source of a memcpy or uncompress operation), this field holds a pointer to the destination section descriptor. Otherwise, it is NULL.

    This field points to the descriptor (i.e., linked-list element) and not to the start of the section in memory.
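Walking such a list is straightforward once the 24-byte element layout is known. The sketch below is illustrative only: it assumes little-endian fields and a NULL next pointer at the end of the list, and it runs on a tiny synthetic memory image at a made-up base address (0x1000) rather than on real firmware.

```python
import struct

DESCRIPTOR = struct.Struct("<6I")  # next, section_name, start_addr, size, unknown, dest_section


class Memory:
    """A flat memory image mapped at a base address (hypothetical helper)."""

    def __init__(self, base: int, data: bytes) -> None:
        self.base, self.data = base, data

    def cstring(self, addr: int) -> str:
        off = addr - self.base
        return self.data[off:self.data.index(b"\0", off)].decode("ascii")


def walk_sections(mem: Memory, head: int):
    """Yield one dict per descriptor; stops at a NULL next pointer or on a cycle."""
    seen = set()
    addr = head
    while addr != 0 and addr not in seen:
        seen.add(addr)
        nxt, name_ptr, start, size, unknown, dest = DESCRIPTOR.unpack_from(
            mem.data, addr - mem.base)
        yield {"addr": addr, "name": mem.cstring(name_ptr), "start_addr": start,
               "size": size, "unknown": unknown, "dest_section": dest}
        addr = nxt


# Synthetic two-element list; descriptor addresses here are invented for the demo.
image = (
    DESCRIPTOR.pack(0x1024, 0x1018, 0x4F522FD0, 0xA25A6C, 1, 0x1024)
    + b".cromtext\0\0\0"
    + DESCRIPTOR.pack(0, 0x103C, 0x4036800C, 0x116655C, 1, 0)
    + b".text\0"
)
sections = list(walk_sections(Memory(0x1000, image), 0x1000))
for s in sections:
    print(f'{s["name"]:<10} start={s["start_addr"]:#x} size={s["size"]:#x} '
          f'dest={s["dest_section"]:#x}')
```

Following dest_section from one element to another ties each source section to the section it initializes.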

Example of two entries:

[0x4e110504] .cromtext:
    next: 0x4e110528 (.crommodule section descriptor)
    section_name: 0x4e11051c (".cromtext")
    start_addr: 0x4f522fd0
    size: 0xa25a6c
    unknown: 0x1
    dest_section: 0x4e110b78 (.text section descriptor)

[0x4e110b78] .text:
    next: 0x4e110b98 (.module section descriptor)
    section_name: 0x4e110b90 (".text")
    start_addr: 0x4036800c
    size: 0x116655c
    unknown: 0x1
    dest_section: 0x0

In this example, .cromtext has a non-zero dest_section (0x4e110b78), which is the address of the .text section descriptor. As expected, the app loader decompresses the .cromtext section and loads it into the .text section, which starts at 0x4036800c.

Some examples of the contents of memory sections include:

  • The .load_apphdr section: The section is constructed as follows:

    Size        Description
    4 bytes     Protected memory entries count (0x1a)
    0x6c bytes  The apphdr struct itself
    0xd0 bytes  Protected memory entries (pairs of 32-bit addresses; 0x1a*8=0xd0 bytes)
  • The .secinfo section contains the parameter triples for the memset, memcpy, and decompress functions, elements of the section descriptor linked list, and the section names as null-terminated strings.
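The layout of .load_apphdr can be cross-checked against the pointers stored in the application header. The short calculation below uses only values quoted in this post; the variable names are ours.

```python
COUNT_SIZE = 4            # 32-bit protected-entries count
APPHDR_SIZE = 0x6C        # size field of the apphdr
PROTECTED_ENTRIES = 0x1A  # number of (start, end) pairs
ENTRY_SIZE = 8            # two 32-bit addresses per pair

section_base = 0x4FFFC000                       # value of the protected_count pointer
apphdr_addr = section_base + COUNT_SIZE         # where the apphdr struct begins
protected_addr = apphdr_addr + APPHDR_SIZE      # where the address pairs begin
section_end = protected_addr + PROTECTED_ENTRIES * ENTRY_SIZE

print(hex(apphdr_addr), hex(protected_addr), hex(section_end - section_base))
```

The computed start of the address pairs matches the protected_addresses pointer (0x4fffc070), which supports the layout described above.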

Insights

Now that we can associate memory address ranges with sections, we can reach some interesting conclusions:

  • The memory sections do not overlap.
  • The protected areas of memory include the following named sections:
.load_text
.load_rodata
.boot_ncdram_hole (empty section)
.load_ncdata (empty section)
.load_data
.load_ncbss
.load_cgdbuf
.load_bss
.nosi_text
.nosi_rodata (empty section)
.nosd_data (empty section)
.nosd_bss
.startup_text
.startup_rodata
.startup_data (empty section)
.startup_bss
.stack
.erom_support_2
.secinfo

These are mostly sections that are critical for running the app loader, and include sections that were initialized by the pre-loader.

  • The memcpy/memset/uncompress parameters correspond to entire memory sections and do not overlap.
  • All the sections that need initialization have corresponding parameters in one of the memcpy/memset/uncompress lists — even those that have been initialized by the pre-loader and are part of the protected ranges.

The last conclusion is encouraging. If we know the address of the apphdr structure, we can have a loader script parse it and initialize the uninitialized memory automatically. The initialization includes those sections with hardcoded addresses in the pre-loader code. The magic 4-byte number at the start of the apphdr structure is unique, and can be used to find the structure.
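As a starting point for such a loader script, the header can be located by scanning an image for the magic value. The sketch below is an assumption-laden illustration: it presumes little-endian storage, and it additionally filters candidates on the observed size (0x6c) and the masked more_magic values, which is our own heuristic rather than something the loader is known to do.

```python
import struct
from typing import List

MAGIC = 0x3CA55A3C
APPHDR_SIZE = 0x6C


def find_apphdr(image: bytes, endian: str = "<") -> List[int]:
    """Return offsets in image where a plausible apphdr structure begins."""
    needle = struct.pack(endian + "I", MAGIC)
    hits = []
    pos = image.find(needle)
    while pos != -1:
        if pos + APPHDR_SIZE <= len(image):
            _, size, m1, m2 = struct.unpack_from(endian + "4I", image, pos)
            if (size == APPHDR_SIZE
                    and (m1 & 0x0FFFFFFF) == 0x0461090D
                    and (m2 & 0x0FFFFFFF) == 0x0B9EF6F2):
                hits.append(pos)
        pos = image.find(needle, pos + 1)
    return hits


# Synthetic image: a decoy magic at offset 0 and a plausible header at offset 0x40.
header = struct.pack("<4I", MAGIC, APPHDR_SIZE, 0x0461090D, 0xFB9EF6F2) + bytes(APPHDR_SIZE - 16)
image = struct.pack("<I", MAGIC) + bytes(0x3C) + header + bytes(0x20)
print([hex(off) for off in find_apphdr(image)])
```

From each hit, the script could go on to decode the remaining fields and replay the memset, memcpy and uncompress lists.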

So, are we done yet? Every path has an end, and we have finally reached ours. Once here, we discover that “Accomplishments will prove to be a journey, not a destination” (Dwight D. Eisenhower).

Let’s take a moment to remember all the stages of the firmware we had to unpack and decode to reach this point:

  • In post no. 1, we unpacked the PCL format, including a proprietary extension and extracted data encoded as a raster graphics image.

  • In post no. 2, we encountered S-records, and all they wanted to do was parse some S-records. We dealt with a proprietary S-record binary variation along the way.

  • In post no. 3, we started looking at the code and discovered that it is self-modifying and staged, with each part loading the next into memory. We also got a crash course in sliding-window compression 101.

  • In post no. 4, we uncovered the app header structure, saw how the different sections of code and data are loaded into memory, and started to see the light at the end of the tunnel. We also finished this blog post series.

What should we do next week?

Be on the lookout for our upcoming announcement on June 16th, when we announce the security findings for which we performed all of this initial research of firmware unpacking.


Thank you

Moshe Rubin and Daniel Goldberg for proofreading
