IoT Firmware Emulation with Qemu
Emulation of three Linux-based devices
While working on a case study, we needed to emulate various firmware samples with their included web service. In this blog post, we are going to take a look at some of them. In addition, a primer on how to approach the analysis of such firmware samples is given.
LevelOne WBR-6013 N300
This sample was the first to be emulated and was the easiest of the three samples in this article. In retrospect, this sample was just a first contact with firmware emulation since FirmAE can already be used to emulate the sample and its corresponding web service. Nevertheless, we wanted to include it in the write-up as it gives valuable insight into how our approaches changed as we emulated other firmware.
Everything-is-a-file
We started by extracting the rootfs with the help of binwalk. Afterward, we created an ext4 image so that the filesystem can be booted with qemu.
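A minimal sketch of these two steps, not necessarily the exact commands we ran. It assumes that binwalk, truncate and an e2fsprogs version whose mke2fs supports -d are installed; the file names and the 64M image size are hypothetical.

```python
import subprocess


def extraction_cmd(firmware):
    """binwalk -e carves and unpacks the filesystems it recognizes."""
    return ["binwalk", "-e", firmware]


def image_cmds(rootfs_dir, image, size="64M"):
    """Commands that pack an extracted rootfs directory into an ext4 image."""
    return [
        # Allocate an empty (sparse) file of the requested size.
        ["truncate", "-s", size, image],
        # -d populates the new filesystem from a directory, -F skips the
        # "not a block device" question for regular files.
        ["mke2fs", "-t", "ext4", "-F", "-d", rootfs_dir, image],
    ]


def run_all(cmds):
    """Run each command and stop at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

Calling run_all([extraction_cmd("firmware.bin")]) followed by run_all(image_cmds("<extracted rootfs dir>", "image.ext4")) would perform both steps; the extracted directory name depends on what binwalk finds in the sample.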
We used a Linux 5.11 kernel configured for the malta board. The firmware can subsequently be started with qemu-system-mips.
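As a rough illustration, a qemu-system-mips invocation for such a setup could be assembled as follows. The kernel and image file names, the root device /dev/sda and the serial console are assumptions that depend on the kernel configuration, not values taken from our setup.

```python
import shlex


def qemu_cmd(kernel, image, root_dev="/dev/sda"):
    """Build a qemu-system-mips command line for the malta board."""
    append = f"root={root_dev} rw console=ttyS0"
    return [
        "qemu-system-mips",
        "-M", "malta",                          # board the kernel was built for
        "-kernel", kernel,
        "-drive", f"file={image},format=raw",   # the ext4 rootfs image
        "-append", append,                      # kernel command line
        "-nographic",                           # serial console on stdio
    ]


print(shlex.join(qemu_cmd("vmlinux", "image.ext4")))
```

Keeping the kernel command line in one place makes it easy to append further boot arguments later on.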
When starting qemu, the sample seems to emulate correctly. However, when studying the system’s behavior, it can be observed that the web server doesn’t get started. At this point (as we didn’t have much experience in this field), we tried to see whether an automatic emulation framework, such as FirmAE, could bring the web service up. To our surprise, it worked, and we therefore started to wonder how the image generated by FirmAE differs from ours. As comparing the two filesystems didn’t yield much progress, we instead compared the generated bootlogs.
Read hw setting header failed!
flash_mib_compress_write DEFAULT_SETTING DONE, __[flash.c-6766]
flash_mib_compress_write CURRENT_SETTING DONE, __[flash.c-6831]
DEFAULT_SETTING hecksum_ok
CURRENT_SETTING hecksum_ok
__[flash.c-6985]Generated PIN = 05141067
When booting, FirmAE’s filesystem generates some kind of pin. This pin generation is missing from our bootlog. While examining the image generation scripts of FirmAE, it stood out that their approach often includes the creation of new devices. Therefore, we assumed that the pin generation must have something to do with these missing files. To test this, we implemented a script that tries to boot the image after deleting a file in /dev. If the pin 05141067 is missing from the bootlog, we break and print the path to the deleted file. If not, we delete another one and try to boot again.
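The search loop described above can be sketched like this. It is an illustration, not our original script: the actual booting is left to a caller-supplied function (boots_with_pin is a hypothetical name) that boots a copy of the image with one file removed and reports whether the pin line appears.

```python
from typing import Callable, Iterable, Optional


def log_has_pin(bootlog: str, marker: str = "Generated PIN") -> bool:
    """True if the pin generation message shows up in a captured bootlog."""
    return marker in bootlog


def find_required_device(
    dev_paths: Iterable[str],
    boots_with_pin: Callable[[str], bool],
) -> Optional[str]:
    """Return the first /dev entry whose removal makes the pin disappear.

    boots_with_pin(path) is expected to boot a fresh copy of the image
    with `path` deleted and return whether the pin was still generated.
    """
    for path in dev_paths:
        if not boots_with_pin(path):
            return path  # deleting this file broke the pin generation
    return None  # no single file in the list was essential
```

Each candidate is tested against a fresh copy of the image, so deletions never accumulate and the first hit points at a single responsible file.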
After executing, the script returns error /dev/mtdblock0. We verified our image, and it appeared that this file was indeed missing in our constructed filesystem. Thus, we created a device with the following configuration:
sudo mknod -m 644 ./dev/mtdblock0 b 31 0
However, after booting the firmware our bootlog still didn’t include the pin. After reading more of FirmAE’s source code and log files, we came across some interesting messages during boot:
[ 0.653828] Creating 11 MTD partitions on "NAND 128MiB 1,8V 8-bit":
[ 0.654311] 0x000000000000-0x000000100000 : "NAND simulator partition 0"
[ 0.656826] 0x000000100000-0x000000200000 : "NAND simulator partition 1"
[ 0.657738] 0x000000200000-0x000000300000 : "NAND simulator partition 2"
[ 0.658473] 0x000000300000-0x000000400000 : "NAND simulator partition 3"
[ 0.659171] 0x000000400000-0x000000500000 : "NAND simulator partition 4"
[ 0.659857] 0x000000500000-0x000000600000 : "NAND simulator partition 5"
[ 0.660546] 0x000000600000-0x000000700000 : "NAND simulator partition 6"
[ 0.661412] 0x000000700000-0x000000800000 : "NAND simulator partition 7"
[ 0.662281] 0x000000800000-0x000000900000 : "NAND simulator partition 8"
[ 0.662983] 0x000000900000-0x000000a00000 : "NAND simulator partition 9"
[ 0.663748] 0x000000a00000-0x000008000000 : "NAND simulator partition 10"
In addition, we found the corresponding qemu argument used by FirmAE:
nandsim.parts=64,64,64,64,64,64,64,64,64,64
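The argument and the partition table above fit together: nandsim takes partition sizes in erase blocks, and each of the ten listed partitions spans 0x100000 bytes, which implies an erase block of 16 KiB (0x100000 / 64). The remaining space ends up in an eleventh partition. A small sketch that reproduces the layout from these numbers (the erase size is inferred from the log, not read from a datasheet):

```python
def nandsim_layout(parts, erase_size, total_size):
    """Return (start, end) byte ranges for a nandsim 'parts' list.

    Sizes in `parts` are given in erase blocks; leftover space becomes
    one additional trailing partition, as seen in the bootlog.
    """
    layout, start = [], 0
    for blocks in parts:
        end = start + blocks * erase_size
        layout.append((start, end))
        start = end
    if start < total_size:
        layout.append((start, total_size))
    return layout


PARTS = [64] * 10                 # nandsim.parts=64,64,...
ERASE_SIZE = 16 * 1024            # inferred: 0x100000 / 64
TOTAL_SIZE = 128 * 1024 * 1024    # "NAND 128MiB"

for index, (start, end) in enumerate(nandsim_layout(PARTS, ERASE_SIZE, TOTAL_SIZE)):
    print(f"0x{start:012x}-0x{end:012x} : partition {index}")
```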
The boot argument nandsim.parts belongs to the nandsim kernel module. FirmAE’s custom kernel includes the module, while our 5.11 kernel does not. This may be the reason why our sample doesn’t work as expected, as the created device isn’t usable as an MTD. While we could recompile our kernel to include the module, a simple yet effective idea came to mind: what if we just created a plain file that stands in for the memory technology device? Keep in mind that in Linux everything is (considered) a file. Due to the abstraction layer the kernel offers, the syscalls used to read and write a file might be the same as the ones used to interact with MTDs in /dev. Therefore, a simple touch mtdblock0 could be sufficient. And indeed, after creating this file, our firmware sample boots successfully, and the webserver is up. Now one last step is needed: adding a tun device so that we can access our emulated device over the network.
D-Link DIR-809
This firmware was the most complex of the three. At some point, we even thought that we might not be able to emulate the sample to its full extent. As you will see, the sheer number of possible error sources made a successful emulation seem unlikely.
mini_httpd
After the initial filesystem extraction and image generation, we took a look at the files found in /etc/init.d/. It can be seen that during boot, the script rcS executes daemon.rc, which is responsible for starting the webserver.
/usr/sbin/mini_httpd -d /usr/www -c '/cgi-bin/*' -u root -S -E /var/mini_httpd.pem -T utf-8
After starting the emulation, a quick ps shows that mini_httpd is indeed running. However, when trying to access the service, the binary crashes.
do_page_fault(): sending SIGSEGV to mini_httpd for invalid read access from 0a4df1d8
epc = 779e800c in libuClibc-0.9.28.so[779c0000+3a000]
ra = 779e7fac in libuClibc-0.9.28.so[779c0000+3a000]
As the message shows that the error is happening in libuClibc-0.9.28.so, we suspected that the uClibc used on the target was compiled against kernel headers that our newer kernel does not understand. We concluded that this could be a recurring problem and that in the future it may be necessary to replace the uClibc with one that works with our kernel version. For now, as replacing the uClibc didn’t seem trivial, we statically compiled a mini_httpd binary instead. As a result, the webserver is now functional, and accessing it reveals an error message.
--Internal Error: get::Send Get Val msg Error)
--Internal Error: Get::Get obj val Failed!
mtd0
From the arguments used to start the webserver, it can be concluded that the CGI is located under /usr/www/cgi-bin/ and that the binary used to handle requests is webproc. For further analysis, the webserver is not required. Therefore, changing into /usr/www/cgi-bin and interacting with webproc directly is viable. When executing the binary, a previously unknown error message can be observed.
The error message Faild to open /dev/mtd0 indicates that the binary would like to access mtd0. From FirmAE we learned that nandsim can be used to simulate flash memory. Therefore, we recompiled our kernel to include the module and created some partitions:
modprobe nandsim first_id_byte=0xc8 second_id_byte=0xd1 third_id_byte=0x80 fourth_id_byte=0x95 cache_file=/root/nandsim.bin parts=8,8,8,20,20,256,256,224,220
Verifying the behavior with strace shows an ioctl call using the fd of /dev/mtd0:
open("/dev/mtd0", O_RDONLY) = 5
ioctl(5, _IOC(_IOC_READ|_IOC_WRITE, 0x47, 0x2, 0x5), 0x7f75ed10) = -1 ENOTTY (Not a tty)
write(1, "Faild to get item.\n", 19Faild to get item.) = 19
As the error message persisted, we concluded that mtd0 doesn’t have the correct data needed by webproc. At this stage, it seemed like without the original data for the mtd partitions, webproc wouldn’t be functional. On the bright side, once we had configured nandsim, the default webserver worked without crashes. This makes our statically compiled mini_httpd obsolete.
Device Safari
To progress, we ordered a D-Link DIR-809. After receiving the device, we soldered a UART interface on the JTAG to acquire console access. Having console access, the next step was to extract data from the mtd partitions. However, to replicate the mtd behavior, nandsim’s configuration needs to be changed to fit the layout of the physical device. Without the right size, erasesize and blocks count, read and write syscalls could result in different data than on the physical device.
# cat /proc/mtd
dev: size erasesize name
mtd0: 00030000 00010000 "boot"
mtd1: 001e0000 00010000 "kernel"
mtd2: 005d0000 00010000 "rootfs"
mtd3: 00020000 00010000 "multi_lang"
# cat /proc/partitions
major minor #blocks name
31 0 192 mtdblock0
31 1 1920 mtdblock1
31 2 5952 mtdblock2
31 3 128 mtdblock3
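The two listings describe the same layout in different units: /proc/mtd reports sizes in hexadecimal bytes, /proc/partitions in 1 KiB blocks. A short sketch that parses the /proc/mtd output above and derives the numbers needed to rebuild the layout (size in KiB and number of erase blocks per partition):

```python
PROC_MTD = """\
dev: size erasesize name
mtd0: 00030000 00010000 "boot"
mtd1: 001e0000 00010000 "kernel"
mtd2: 005d0000 00010000 "rootfs"
mtd3: 00020000 00010000 "multi_lang"
"""


def parse_proc_mtd(text):
    """Parse /proc/mtd into (dev, size_bytes, erasesize_bytes, name) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.startswith("mtd"):
            continue  # skip the header line
        dev, size, erasesize, name = line.split(maxsplit=3)
        entries.append((dev.rstrip(":"), int(size, 16), int(erasesize, 16), name.strip('"')))
    return entries


entries = parse_proc_mtd(PROC_MTD)
for dev, size, erasesize, name in entries:
    print(f"{dev} {name}: {size // 1024} KiB, {size // erasesize} erase blocks")
print("total:", sum(size for _, size, _, _ in entries) // (1024 * 1024), "MiB")
```

The KiB values match the #blocks column of /proc/partitions, and the partitions add up to the full size of the flash chip.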
The simplest way to accomplish this is by figuring out which chip is running on the system. By knowing the name of the memory chip, its datasheet can be consulted for its identification bytes. These bytes can then be passed to nandsim through its *_id_byte options. Luckily, the bootlog included the following string: 8 MB MX25L6405D at mode 1. However, nandsim doesn’t seem to support the MX25L6405D chip, and thus we replicated the partition layout via trial and error instead. Subsequently, we extracted the content of each mtdblock by running cat /dev/mtdblockX > dumpX and imported the data on the emulated device with the help of nandwrite. Next, we verified the hashes of the data and tried starting webproc again. Despite all the work done, the binary still returned the same errors.
To make more sense of this, we examined webproc with strace and ghidra.
The Ghidra generated code shows that there aren’t any read or write syscalls happening on the fd of the mtd. Instead, an ioctl syscall gets used. Studying the output of strace shows that this ioctl fails. We concluded that this problem occurs because our kernel doesn’t have the module to communicate with an MX25L6405D chip. This issue was already observable at the beginning of our journey, where we used strace to see if our nandsim configuration worked.
open("/dev/mtd0", O_RDONLY) = 5
ioctl(5, _IOC(_IOC_READ|_IOC_WRITE, 0x47, 0x2, 0x5), 0x7f75ed10) = -1 ENOTTY (Not a tty)
write(1, "Faild to get item.\n", 19Faild to get item.) = 19
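The strace line itself carries a hint about the request. strace prints unknown ioctls as _IOC(direction, type, number, size), and the type byte here is 0x47 ('G'), whereas the standard MTD ioctls defined in the kernel’s mtd-abi.h use the type 'M'. This suggests a request that a stock MTD driver does not implement, which would be consistent with the ENOTTY result. A small sketch that decodes the fields from the strace output (the interpretation as a vendor-specific request is our assumption):

```python
import re

MTD_IOCTL_TYPE = "M"  # type byte of the kernel's standard MTD ioctls

IOC_PATTERN = re.compile(
    r"_IOC\(([^,]+),\s*(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+)\)"
)


def decode_strace_ioc(text):
    """Extract (direction flags, type char, number, size) from strace's _IOC(...) form."""
    match = IOC_PATTERN.search(text)
    if match is None:
        raise ValueError("no _IOC(...) expression found")
    direction = match.group(1).split("|")
    ioc_type, number, size = (int(match.group(i), 16) for i in (2, 3, 4))
    return direction, chr(ioc_type), number, size


LINE = "ioctl(5, _IOC(_IOC_READ|_IOC_WRITE, 0x47, 0x2, 0x5), 0x7f75ed10) = -1 ENOTTY (Not a tty)"
direction, ioc_type, number, size = decode_strace_ioc(LINE)
print(direction, repr(ioc_type), number, size)
print("standard MTD type" if ioc_type == MTD_IOCTL_TYPE else "not a standard MTD type")
```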
logic
Afterward, we tried getting the web service up by tracing function calls on the physical device and hooking these functions on the emulated device such that the behavior of the physical system can be replicated. However, as statically compiling ltrace is not trivial and manually tracing function calls with gdb is laborious, the approach was quickly abandoned.
We wondered if we missed something essential and went back to study the error message shown on the webpage.
--Internal Error: get::Send Get Val msg Error)
--Internal Error: Get::Get obj val Failed!
By comparing the output from strace of the physical system and the emulated device, we inferred that the message may have something to do with interprocess communication.
#physical device
socket(AF_UNIX, SOCK_DGRAM, 0) = 4
bind(4, {sa_family=AF_UNIX, sun_path="/var/pid/0x05"}, 110) = 0
chmod("/var/pid/0x05", 0777) = 0
sendto(4, "\0\0\0\0\1\0O\0\1\5\2\4\0\0\0\0>\0\0\0?\0\0\0\4\0\0\0sess"..., 87, MSG_DONTWAIT, {sa_family=AF_UNIX, sun_path="/var/pid/0x04"}, 110) = 87
#emulation
socket(AF_UNIX, SOCK_DGRAM, 0) = 3
bind(3, {sa_family=AF_UNIX, sun_path="/var/pid/0x05"}, 110) = 0
chmod("/var/pid/0x05", 0777) = 0
sendto(3, "\0\0\0\0\1\0O\0\1\5\2\4\0\0\0\0>\0\0\0?\0\0\0\4\0\0\0sess"..., 87, MSG_DONTWAIT, {sa_family=AF_UNIX, sun_path="/var/pid/0x04"}, 110) = -1 ECONNREFUSED
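The only difference between the two traces is the result of sendto. On Linux, ECONNREFUSED on a Unix datagram socket typically means that the socket file exists but no process is bound to it anymore, while a missing file yields ENOENT. The following sketch reproduces these cases with a throwaway socket path; it is a generic illustration and does not talk to the firmware’s processes.

```python
import errno
import os
import socket
import tempfile


def probe_peer(peer_path, payload=b"ping"):
    """Send one datagram to peer_path and classify the outcome."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        client.sendto(payload, peer_path)
        return "delivered"
    except OSError as exc:
        if exc.errno == errno.ECONNREFUSED:
            return "refused"   # socket file exists, but nobody is bound to it
        if exc.errno == errno.ENOENT:
            return "missing"   # socket file does not exist at all
        raise
    finally:
        client.close()


def demo():
    """Walk through the three states of a receiver socket."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "0x04")
        receiver = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        receiver.bind(path)
        print("receiver running:", probe_peer(path))
        receiver.close()  # the receiver exits, its socket file stays behind
        print("receiver gone:", probe_peer(path))
        os.unlink(path)
        print("file removed:", probe_peer(path))
```

Seen this way, the failing sendto points at the process expected to be bound to /var/pid/0x04 rather than at the sender.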
In addition, we noticed a running process called logic on the physical device. As we had never seen that process before, we examined our emulated device and noticed that this process was indeed missing. As we knew from our analysis that logic is a child of pc and that the parent was running, we deduced that logic gets terminated. Therefore, we fired up ghidra and searched for sections of code that exit the program. Also, during our ghidra session the string nRet == TBS_SUCCESS caught our eye. This string can also be found in the bootlog, meaning that logic gets at least started at some point.
In the main function it can be seen that logic tries to start different modules (e.g. WEBP_ModuleInit(), DDNS_ModuleInit()) and checks if the initialization succeeded. If that isn’t the case, an assert is called, which leads to the termination of the binary. If all modules started successfully, the process goes into a while loop, enabling process communication with MSG_ReceiveMessage(). As our emulated firmware couldn’t initialize some of these modules, we tried patching the binary so that it always reaches the last part, regardless of whether the modules started successfully.
And to our surprise, it worked! We started webproc and the web service was working.
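One common way to apply such a patch is to overwrite the instruction that leads into the assert path with a nop. The helper below is a generic sketch of that idea, not the exact patch we applied: offsets and original instruction bytes have to be taken from the disassembly and are hypothetical here. It checks the bytes it is about to replace so that a wrong offset is caught before the file is modified.

```python
MIPS_NOP = bytes(4)  # "sll $zero, $zero, 0" encodes as 0x00000000


def patch_bytes(image, offset, expected, replacement):
    """Replace `expected` at `offset` with `replacement` in a bytearray.

    Refuses to patch if the current bytes differ from `expected` or if
    the patch would change the file size.
    """
    if len(expected) != len(replacement):
        raise ValueError("patch must not change the file size")
    current = bytes(image[offset:offset + len(expected)])
    if current != expected:
        raise ValueError(f"unexpected bytes at {offset:#x}: {current.hex()}")
    image[offset:offset + len(replacement)] = replacement


def patch_file(path, patches):
    """Apply a list of (offset, expected, replacement) patches to a file."""
    with open(path, "rb") as handle:
        image = bytearray(handle.read())
    for offset, expected, replacement in patches:
        patch_bytes(image, offset, expected, replacement)
    with open(path, "wb") as handle:
        handle.write(image)
```

Working on a copy of the binary keeps the original available for comparison.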
Foscam C1 HD 720P
The firmware we received was a flash dump of the physical device. With the memory layout found by Felipe Astroza Araya, we extracted each partition with dd.
dd if=flash_dump of=mtd0 skip=0 bs=512 count=1024
dd if=flash_dump of=mtd1 skip=1024 bs=512 count=6144
dd if=flash_dump of=mtd2 skip=7168 bs=512 count=22528
dd if=flash_dump of=mtd3 skip=29696 bs=512 count=2048
dd if=flash_dump of=mtd4 skip=31744 bs=512
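With bs=512, the skip and count values are given in 512-byte sectors, and each partition starts exactly where the previous one ends. The same carving can be sketched in a few lines, which also makes the byte offsets visible (only the sector numbers are taken from the commands above):

```python
SECTOR = 512  # matches bs=512 in the dd commands

# (name, first sector, sector count); None means "until the end of the dump"
LAYOUT = [
    ("mtd0", 0, 1024),
    ("mtd1", 1024, 6144),
    ("mtd2", 7168, 22528),
    ("mtd3", 29696, 2048),
    ("mtd4", 31744, None),
]


def carve(dump, layout=LAYOUT, sector=SECTOR):
    """Split a flash dump (bytes) into named partitions."""
    parts = {}
    for name, skip, count in layout:
        start = skip * sector
        end = len(dump) if count is None else start + count * sector
        parts[name] = dump[start:end]
    return parts


for name, skip, count in LAYOUT:
    size = "rest of dump" if count is None else f"{count * SECTOR // 1024} KiB"
    print(f"{name}: offset {skip * SECTOR:#09x}, size {size}")
```

Reading the dump with open("flash_dump", "rb").read() and writing each entry of carve(...) to its own file gives the same result as the dd calls.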
After extracting each mtd part with binwalk -e, we found the file etc/init.d/S00devs in mtd0.
[ ... ]
mount -t squashfs /dev/mtdblock2 /mnt/app
mount -t jffs2 /dev/mtdblock3 /mnt/app_ext/
mount -t jffs2 /dev/mtdblock4 /mnt/para/
ln -s /mnt/app/mtd/boot.sh /mnt/mtd/
ln -s /mnt/app/mtd/pkg_info /mnt/mtd/
ln -s /mnt/app/mtd/resolv.conf /mnt/mtd/
ln -s /mnt/app/mtd/app/* /mnt/mtd/app/
ln -s /mnt/app/mtd_ext /mnt/
It can be seen that /dev/mtdblock2, /dev/mtdblock3 and /dev/mtdblock4 are getting mounted on the filesystem. As we have a flash dump, we can copy the files from mtd2, mtd3, and mtd4 over so that the rootfs structure is correct.
In retrospect, commenting these lines out is a great idea. As we use nandsim for flash simulation, mounting these partitions on boot messes with the file structure and deletes essential files needed for emulation. Later, it can be seen that we will need to touch or cp these deleted files. This could have been prevented.
From /etc/init.d/S90init it can be concluded that most of the boot procedure is happening in /mnt/mtd/boot.sh.
/mnt/mtd/boot.sh does some sanity checks and in the end tries to start MsgServer and /usr/bin/watchdog:
[ ... ]
hostname IPCamera
#if [ -f ${APP_DIR}/zbin.tar.xz ];then
# echo "The first time boot OK"
#else
#rtctool -rtctosys
MsgServer &
update_ver
killall udevd # free 464k mem memory
sleep 5
/usr/bin/watchdog &
#fi
While MsgServer seems to run without problems, watchdog crashes with the following error:
Load manufacturer config fail, now restore to default config
terminate called after throwing an instance of 'boost::archive::archive_exception'
what(): output stream error
With the help of ghidra we concluded that watchdog is responsible for starting the following binaries in /mnt/mtd/app/bin/:
devMng
codec
webService
storage
We analyzed the binaries separately to reduce the overhead produced by watchdog. As storage was running without problems, we began by repairing webService.
When starting webService, the same error message can be observed. Studying the behavior of webService shows that the binary is trying to open non-existent files:
# /medusa_utils/strace webService
open("/mnt/mtd/app/config/defConfig_model.xml", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/mnt/mtd/app/config/ProductConfig.xml", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[ ... ]
open("/mnt/mtd/app/config/DefProductConfig.xml", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[ ... ]
open("/mnt/mtd/app/config/ManufacturerConfig.xml", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
[ ... ]
open("/mnt/mtd/app/config/ManufacturerConfig.xml", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = -1 ENOENT (No such file or directory)
[ ... ]
open("/mnt/mtd/app/config/ManufacturerConfig.xml", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = -1 ENOENT (No such file or directory)
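With longer traces, picking out the failing paths by hand gets tedious. A small filter over the strace output can list every path whose open call failed with ENOENT; this sketch assumes strace’s default one-syscall-per-line output and handles both open and openat:

```python
import re

# Matches open("path", ...) and openat(AT_FDCWD, "path", ...) that fail with ENOENT.
ENOENT_OPEN = re.compile(r'^open(?:at)?\((?:AT_FDCWD, )?"([^"]+)".*= -1 ENOENT')


def missing_paths(strace_output):
    """Return the unique paths that could not be opened, in order of first use."""
    seen = []
    for line in strace_output.splitlines():
        match = ENOENT_OPEN.match(line.strip())
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Saving the trace with strace -o trace.log and passing the file contents to missing_paths() yields a list that can be compared against the extracted filesystem.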
As we were sure that these files existed before boot, we verified whether the running filesystem differs from the one we constructed earlier. We noticed that the /mnt/mnt/app/config, /mnt/mtd/app/config as well as the /usr/local/hisiv300/ssl folder were empty on the running system, meaning that these folders are emptied during the boot procedure. As we used Medusa, a chroot approach to emulate firmware samples, our solution for this problem was to copy the missing files into the rootfs of the emulated device after booting.
cp -R /para/config/* /medusa_rootfs/mnt/mtd/app/config/
cp -R /para/etc/* /medusa_rootfs/mnt/mtd/app/etc/
cp /para/openssl.cnf /medusa_rootfs/usr/local/hisiv300/ssl/openssl.cnf
After repairing the file structure, webService started successfully.
Next, we decided to bring up codec and devMng. We analyzed the binaries with strace and observed that they were accessing some missing devices in /dev. After creating these files, both of them seemed to work.
mknod -m 666 /dev/higpio c 1 5
mknod -m 666 /dev/isp_dev c 1 5
mknod -m 666 /dev/sys c 1 5
mknod -m 666 /dev/vb c 1 5
mknod -m 666 /dev/vi c 1 5
mknod -m 666 /dev/vpss c 1 5
mknod -m 666 /dev/hi_i2c c 1 5
mknod -m 666 /dev/adc c 1 5
mknod -m 666 /dev/venc c 1 5
Subsequently, we started watchdog. After installing the flash plugin needed to communicate with the service, we were able to log in. As we didn’t have the required credentials, we deleted /mnt/para/config/UserAccountConfig.bin to force the service into resetting the user database.
Lessons learned
As we’ve seen in the three write-ups, when trying to emulate a firmware and its corresponding web service, there is no one-size-fits-all solution. In the case of LevelOne WBR-6013 N300, creating a single file was sufficient, while on the D-Link DIR-809 the main binary needed to be patched. For each firmware, the path to successful emulation may therefore differ considerably.
Image Generation
For rootfs extraction and image generation from scratch, take a look at LevelOne WBR-6013 N300 and Foscam C1 HD 720P. When compiling a kernel, make sure that nandsim and raw socket support are included. The first one is needed when working with MTDs, while the second one can be essential for interprocess communication. Booting qemu with at least one nandsim flash partition is recommended, as this can sometimes catch unhandled exceptions, as we’ve seen in D-Link DIR-809. Despite all this, you may still need to recompile your kernel to fit the needs of your firmware sample.
In the process of our case study, we cooperated with the people behind Medusa and therefore recommend it for the tasks above. Medusa takes an archive of the root filesystem and builds a ready-to-use qemu instance on top of it. Further, multiple binaries like strace and gdb are already statically compiled on the system, meaning that a lot of laborious work has already been done for you.
To be exact, Medusa emulates firmware by generating an image from the firmware’s filesystem and chrooting into it on system start. This benefits embedded device analysis, as binaries and tools that do not work on the target device because of missing libraries can still be used outside of the chroot.
Analysis
For analysis, we would suggest the following path: try to understand the boot procedure and go down the rabbit hole from there. First, after creating and starting the image, make sure to keep a copy of the bootlog. As we’ve seen in D-Link DIR-809, the “assert” string and the chip name could be found in that log. Next, try to find out how the webserver gets started. Most of the time, you will find embedded systems running SysVinit. In that case, read the files found in /etc/init*.
When you have a CGI or binary that handles web communication, use strace on the binary to find out if there are any missing files that the service needs. Often XML files, which are used for the initial configuration, are required. However, as that data is usually stored in mtd partitions, they can appear to be missing. In that case, touch can sometimes result in successful emulation. When actual data is required, try patching the binary with the help of ghidra or similar tools.
Also, check if there are any socket communication syscalls in the output of strace. Depending on the options used, they may give a hint if interprocess communication is used. Here, often another “helper” or “watchdog” binary may be needed. For examples, see Foscam C1 HD 720P or D-Link DIR-809.
In addition, try executing every non-standard binary you find. Sometimes these devices have debugging binaries lying around in case something needs to be troubleshot by the developers. These binaries can be used to gain information on the underlying architecture (for example, by printing logs). In the case of Foscam C1 HD 720P, we used /mnt/mtd/app/bin/debugCat to enable the logging left by the developers. In hindsight, while it wasn’t essential for the emulation success, it helped us greatly in understanding the behavior of the firmware.
Networking
In terms of networking, the configuration used will always depend on your qemu setup. However, one thing that stood out while we were studying these firmware samples was that some firmware might act unusually when no network interface is configured. Also, in the case of Foscam C1 HD 720P, the firmware behaved slightly differently when it was able to issue DNS requests. Therefore, testing and analyzing the firmware under various conditions is recommended.