This post covers my solution to the Atredis BlackHat 2018 challenge, for which I won second place and a ticket to BlackHat. I'd like to express my gratitude to the author, the increasingly-reclusive Dionysus Blazakis, as well as Atredis for running the contest.

Initial Recon
As you can see from the screenshot in the tweet linked above, and reproduced below, once you connect to the network service on arkos.atredis.com:4444, the system prints information about the firmware and hardware version, as well as the memory map. Following that is the message-of-the-day, which contains a reference to 8-bit computing.
It also prints the contents of an email, which makes reference to an attachment having been written to disk. It seems like the immediate goal of the challenge is clear: retrieve the attachment from disk.
At about 8:30PM, I logged into the system for the first time. The challenge had been released at 2PM and I figured many people would have already solved it. Not knowing what to do at the prompt, I typed "help". The system replied by printing information about two commands: "help" and "read", with an example of how to use "read":

'read F400' reads the byte at $F400

I ran that command and it came back with the value 31, or '1' in ASCII. I ran the command for a few adjacent addresses and this location seemed to contain the hardware version string "1.0.3 rev A", part of what was printed initially in the first messages upon connecting to the service.

Dumping the Firmware
At first blush, there wasn't much more to do than read bytes off of the system at specified addresses. However, this challenge was not my first rodeo in the embedded reverse engineering space. I was immediately reminded of an exploit I once wrote for a SOHO router, where, once I obtained its memory map, I used an arbitrary read vulnerability to dump its memory contents so I could further analyze the software in IDA. I decided to do something very similar here, minus the need for an arbitrary read vulnerability.
Although I don't like Python much as a programming language, I do have to credit it for having an absurdly large standard library. In particular, while previously writing the aforementioned exploit, I made use of Python's telnetlib module to automate interaction with the router. Nothing seemed to be stopping me from doing the same thing in this situation, so I spent about 10 minutes writing a 30-or-so-line Python script to log into the device, repeatedly send "read" commands, parse the resulting output, and save the results to a binary file. That functionality, combined with the memory map printed by the device upon connection, was all that was needed. You can find the dumped memory as .bin files here.
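For readers who want to try something similar, here is a minimal sketch of the read-and-parse loop. It is not the original script. The transport is left as a hypothetical `send_command` callable (a telnet or socket wrapper that sends one line and returns the reply), and the reply format is assumed to end with the byte as two hex digits, which may not match the real device exactly.

```python
import re
from typing import Callable

# Assumption: the device's reply contains the byte value as a standalone
# two-digit hex number, and it is the last such number in the reply.
HEX_BYTE = re.compile(r"\b([0-9A-Fa-f]{2})\b")


def parse_read_reply(reply: str) -> int:
    """Extract the byte value from the reply to a 'read' command."""
    matches = HEX_BYTE.findall(reply)
    if not matches:
        raise ValueError(f"no hex byte found in reply: {reply!r}")
    return int(matches[-1], 16)


def dump_range(send_command: Callable[[str], str], start: int, end: int) -> bytes:
    """Read every address in [start, end] with one 'read' command each."""
    out = bytearray()
    for addr in range(start, end + 1):
        reply = send_command(f"read {addr:04X}")
        out.append(parse_read_reply(reply))
    return bytes(out)
```

Keeping the transport behind a callable makes the loop easy to exercise against a fake device before pointing it at a slow network service.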
My script took nearly four hours to dump the memory. I don't know how much of that was related to the crappy Wi-Fi at my hotel, and how much had to do with other contestants hammering the server. Nevertheless, by the time I had the memory dump, it was 12:15AM and I had a business engagement in the morning. I needed to finish quickly so I could sleep.

Inspecting the Firmware
I began by briefly inspecting the dumped memory contents in a hex editor. The three regions which, in total, encompassed the range 0x0000-0x1FFF, were entirely zero. Only a few bytes within 0xF000-0xFEFF were non-zero. In the range 0xFF00-0xFFFF, only the final three words were non-zero.
The most interesting dumped memory range was 0x4000-0xEFFF. It began with the strings that the server printed upon connection, as well as others that had not been printed. The most interesting strings were "write" and "call", which seem like they might have been commands that the user could execute on the system. After these strings, the memory dump had zeroes until address 0x8000.
At address 0x8000, there were 0x542 bytes of unknown binary data, with the remainder of the region being zeroes. Having inspected the entire memory space, I concluded that if this thing had any code in it at all, it must be the bytes at 0x8000-0x8542. The only other non-zero, unknown data was the few sporadic, isolated bytes previously mentioned in the range 0xF000-0xFFFF.
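The same survey can be done programmatically instead of by eye in a hex editor. The sketch below lists the non-zero runs in a dump, merging runs separated by short stretches of zeroes. The function name and the default gap threshold are arbitrary choices for illustration.

```python
def nonzero_runs(data: bytes, base: int = 0, max_gap: int = 16):
    """Return inclusive (start, end) address pairs covering non-zero bytes.

    Runs separated by at most max_gap zero bytes are merged into one.
    """
    runs = []
    start = last = None
    for offset, value in enumerate(data):
        if value == 0:
            continue
        if start is None:
            start = last = offset
        elif offset - last - 1 <= max_gap:
            last = offset
        else:
            runs.append((base + start, base + last))
            start = last = offset
    if start is not None:
        runs.append((base + start, base + last))
    return runs
```

Passing the region's load address as `base` makes the reported ranges line up with the memory map printed by the device.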
I connected to the system again and tried executing the "call" command I had discovered in the strings. I provided it the address 8000, which seemed to be the beginning of the code in the memory image. The thing printed:

JSR $8000

Apart from that, nothing happened. Next, I executed "call 7FFF" and the system reset. I took that as a positive sign.

Determining the Architecture
At this point, I did not know what architecture I was dealing with. I had two hints. First, the message-of-the-day string made reference to 8-bit computing. Second, the "call" command had printed the string "JSR", which on some architectures is a mnemonic for "jump to subroutine" (i.e., the same functionality known as "call" on x86). The best logical guess right now is that we are dealing with an 8-bit architecture where the mnemonic for the call instruction is "JSR".
I wish I could tell you I used a systematic procedure in searching for such an architecture, but I would be lying. In retrospect, I could have loaded %IDASDK%\include\allins.hpp and searched for "jsr". The string occurs 32 times in that file, which would have given me a pile of possibilities:

Angstrem KR1878
DEC ALPHA
DEC PDP-11
Freescale M68HC12 or M68HC16 or HCS12
Java bytecode
Mitsubishi 740 or 7700 or 7900
MOS M65 [e.g. 6502] or M65816
Motorola 6809 or 68000
Motorola DSP56000
Panasonic MN102
Renesas H8, H8500, SuperH, or M16C
Rockwell C39
Instead what I ended up doing was searching Google for things like "assembly jsr". For any architecture suggested by one of my search results, I loaded the .bin file for the memory at 0x4000, changed the processor type to the one I was testing, and tried to disassemble the file using the relevant processor module. I ran through a few theories, among them System/360 and 68K. For each of my theories, IDA either refused to disassemble most of the code, or the resulting disassembly was gibberish. For example, when loaded as 68000, IDA refused to disassemble much of the code:
I almost gave up after about 15 minutes, since I had to be somewhere in the morning. But, a final Google search for "8-bit assembly jsr" brought me to a Wikibooks page on 6502 assembly language, which also has a "jsr" instruction. I loaded the binary as 6502, and lo and behold, the resulting disassembly listing looked legitimate. There were, for example, several loads followed by compares, followed by relatively-encoded short jumps to the same nearby location; loads of 0 or 1 immediately before a RTS (return) instruction, and so on.
It looked like I had found my winner.

Loading the Binary Properly
Now that I seemed to know the architecture, I needed to load all of the memory dumps into a single IDA database for analysis. This is easy if you know what you're doing. If you do, you may as well skip this section. If you don't know how to do this, read on.
First, I loaded the file for memory range 0x4000-0xEFFF into IDA and changed the processor type to M6502.
Next I had to change the loading offset and the region that IDA considered as containing the ROM.
From there, I loaded each of the remaining memory dumps as additional binary files.
For each of those, I had to change their loading offset so they'd be represented at the proper location within the database.
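Outside of IDA, the same layout can be approximated in a few lines of Python by copying each dump into a flat 64 KiB image at its base address. This is only a sketch: the helper name is hypothetical, and the base addresses would come from the memory map the device prints on connection.

```python
def build_image(segments):
    """Place (base_address, data) pairs into a zero-filled 64 KiB image."""
    image = bytearray(0x10000)
    for base, data in segments:
        end = base + len(data)
        if base < 0 or end > len(image):
            raise ValueError(f"segment at {base:#06x} does not fit in 64 KiB")
        image[base:end] = data
    return image
```

A flat image like this is convenient for quick scripted checks, since every address in the 6502's 16-bit address space maps directly to an index.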
That's all; now the IDB has all of the memory ranges properly loaded. The clean IDB can be found here.

Reverse Engineering the Binary

I only have a few notes on the process of statically reverse engineering this particular binary. That's because, for the most part, I used the same process as I would while statically reverse engineering any binary on any architecture. In particular, I followed the same procedure that I teach in my static reverse engineering training classes. The whole process took 30-40 minutes. You can find the final IDB here.

Reset Vector
It's often useful to know where execution begins when a system or program executes. For embedded devices, the "system entrypoint" is often held as a pointer in the system's reset vector. For example, on 16-bit x86, the memory location F000:FFF0 contains the address to which the processor should branch upon reset.
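The 6502 has the same notion: its NMI, reset, and IRQ/BRK vectors are little-endian 16-bit words stored at 0xFFFA, 0xFFFC, and 0xFFFE. A minimal sketch of pulling them out of a full 64 KiB memory image follows. The function name and the flat-image input are assumptions made for illustration.

```python
import struct

# Standard 6502 vector order, starting at address 0xFFFA.
VECTOR_NAMES = ("NMI", "RESET", "IRQ/BRK")


def read_vectors(image: bytes) -> dict:
    """Return the three 6502 vectors from a 64 KiB memory image."""
    if len(image) != 0x10000:
        raise ValueError("expected a full 64 KiB image")
    values = struct.unpack_from("<3H", image, 0xFFFA)
    return dict(zip(VECTOR_NAMES, values))
```

The "RESET" entry is the address where execution begins after a reset, so it is a natural first place to start reading a disassembly.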
For this binary, I noticed that the final three words in the memory dump -- at addresses 0xFFFA, 0xFFFC, and 0xFFFE -- all contained the value 0x8530, which happens to be near the end of the region with the code in it. It seems likely that this is our entrypoint. I renamed this location BOOTLOC and began reading.

6502 Documentation and Auto Comments
Now my job is to analyze and document 1300 bytes of 6502 code. A slight problem is that I don't know anything in particular about 6502 beyond some tutorials I read years ago. Nevertheless, this was not a major impediment to reverse engineering this binary. Two things helped. First, the Wikibooks web page I had found previously (which had tipped me off that the binary might be 6502) was a reasonably good layman's guide to the instruction set. Whenever I wanted to know more about a particular mnemonic, I searched for it on that page. Most of my questions were answered immediately; a few required me to correlate information from a few different parts within the document. It was good enough that I didn't have to consult any outside sources.
The second feature that helped me was IDA's "auto comments" feature, under Options->General->Auto Comments.
Once enabled, IDA leaves brief one-line descriptions of the mnemonics in the same color that it uses for repeating comments. You can see an example in the screenshot below. Although the comments may confuse you if you don't know 6502, the Wikibooks link above was enough to fill in the missing details, at which point the auto comments assisted in rapidly disassembling the binary.

Disjointed References

One slightly abnormal feature of this binary -- probably 6502 binaries in general -- is the fact that the instructions usually operate on 8-bit values, but the memory addresses are 16 bits. Thus, when the code needs to write a value into memory that is more than 8 bits, it does so one byte at a time. The following screenshot illustrates:
Lines 0x817D-0x8183 write the constant 0x405F into memory (at location 0x80) as a little-endian word. Lines 0x8188-0x818E write the constant 0x406C to the same location. Lines 0x8193-0x8199 write the constant 0x4081 to the same location.
Those constants stuck out to me as being in the area where the strings are located. Indeed, here's the view of that location:
Thus, the addresses written by the code previously are those of the first three strings making up the textual description of the memory map that is printed when connecting to the system. (The string references also tip us off that the called function likely prints strings.)
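To make the byte-at-a-time pattern concrete, here is a small sketch that scans raw bytes for a low-byte store followed by a high-byte store to the next zero-page location, and recombines the two immediates into one 16-bit pointer. It assumes the stores are encoded as LDA #imm (opcode 0xA9) and STA zero-page (opcode 0x85). Those are real 6502 encodings, but whether this binary uses exactly that sequence is an assumption, and a linear byte scan like this can report false positives because it does not track instruction boundaries.

```python
LDA_IMM = 0xA9  # LDA #imm
STA_ZP = 0x85   # STA zero-page address


def find_pointer_stores(code: bytes, base: int = 0):
    """Find LDA #lo / STA zp / LDA #hi / STA zp+1 sequences.

    Returns a list of (address, zero_page_location, pointer) tuples.
    """
    results = []
    for i in range(len(code) - 7):
        w = code[i:i + 8]
        stores_adjacent = w[7] == ((w[3] + 1) & 0xFF)
        if (w[0] == LDA_IMM and w[2] == STA_ZP and
                w[4] == LDA_IMM and w[6] == STA_ZP and stores_adjacent):
            pointer = w[1] | (w[5] << 8)  # little-endian recombination
            results.append((base + i, w[3], pointer))
    return results
```

Each recovered pointer can then be looked up in the string region by hand, which is the manual step described here.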
Splitting larger constants into smaller ones is not unique to M6502; for example, compiled binaries for most processors with fixed-width instructions do this. However, as far as I know, IDA's support for recognizing and recombining these constructs into macroinstructions must be implemented per processor (as part of the processor module). Evidently, the 6502 processor module does not support fusing writes into pseudo-operations. Therefore, we have to resolve these references manually while reverse engineering. This is not difficult (and we only have about 1300 bytes of code to analyze). Simply use the "G" keyboard shortcut to jump to the referenced locations when you see them. For strings, just copy and paste the string at the referencing location.

Memory-Mapped I/O

When the system prints its memory map upon startup, the region from 0xF000-0xFF00 is labeled "MMIO", which presumably is short for memory-mapped I/O. Memory-mapped I/O addresses are effectively global variables stored in memory. When you write to a memory-mapped I/O location, the data is accessible to other components such as peripherals (e.g., a screen controller or a disk controller). Similarly, peripherals can also share data with the 6502 component via memory-mapped I/O, such as keyboard input or the results of disk I/O.
Reverse engineering the variables in this range is effectively the same as reverse engineering accesses to any global variable, although you must keep in mind that MMIO output locations might never be read, and MMIO input locations might never be written.

System Description

Once familiar with the basics of 6502, and the particular points described above, I statically reverse engineered the binary's scant 25 functions.

Upon boot, the system prints the diagnostic messages we see in the screenshot at the top of this post. Next, it loads the message-of-the-day off of the file system (from the file /etc/motd) and prints it. Next, it checks mail (under the file /var/mail/spool/atredis) and prints it. Finally, it enters into an infinite loop processing user input and dispatching commands.
Reading the code for the final step -- command dispatch -- we can see indeed that "write" and "call" are valid, undocumented commands. The "write" command is ultimately very straightforward: it parses its address and value arguments as hexadecimal, and then writes the value to the address. The "call" command is also straightforward, but I found it neat. It creates 6502 code at runtime for a JSR and RTS instruction, which it then invokes. My daily work does not usually involve examining JIT code generation on archaic platforms.

File I/O
It's been a while since we mentioned it, but recall from the introduction that when we connected to the system, it printed the most recent mail, and dropped a hint about the attachment having been written to disk. Our ultimate goal in reverse engineering this binary is to find and exfiltrate this mystery file from the disk. The message did not actually give us a filename, or anything of that sort. Let's take a closer look at how the system loads and prints files off of the disk, such as the message-of-the-day and the most recent email.
The full details can be found in the IDB, but the process generally works like this:

PrintMOTD() writes a pointer to "/etc/motd" into a global variable X, then calls Load().
Load() reads, in a loop, each of the 0x40 disk sectors into a global array Y and compares the first bytes of Y against X.
Once found, Load() returns 1. The contents of the disk sector are still in Y.
If Load() succeeded, PrintMOTD() calls PrintDiskSectors().
At this point, the global buffer Y contains not only the name of the file, but also two words; let's call them Z and W. Z indicates the number of the first disk sector which contains the file's contents, and W is the number of sectors that the file occupies.
PrintDiskSectors() then consults Z and W from within the global array. Beginning at sector Z, it prints the raw contents of the file onto the screen, and repeats for W sectors.

(My analysis at the time was slightly sloppier than the above. I did not fully understand the role of the values I've called Z and W.)

Enumerating the Disk Contents, Finishing the Challenge
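The steps above can be modeled in a few lines of Python. This is a toy model, not a transcription of the firmware: the sector size, the width of the name field, and the position of Z and W within the sector are all hypothetical values picked for illustration. Only the sector count (0x40) and the overall lookup-then-print flow come from the description above.

```python
import struct

NUM_SECTORS = 0x40   # from the description above
SECTOR_SIZE = 0x80   # hypothetical
NAME_FIELD = 0x20    # hypothetical: NUL-padded name, then Z and W as LE words


def load(disk, name):
    """Scan all sectors; return the one whose leading bytes match name."""
    wanted = name.encode("ascii") + b"\x00"
    for number in range(NUM_SECTORS):
        sector = disk[number]
        if sector.startswith(wanted):
            return sector      # plays the role of the global buffer Y
    return None


def print_disk_sectors(disk, entry):
    """Return the raw contents of W sectors starting at sector Z."""
    z, w = struct.unpack_from("<HH", entry, NAME_FIELD)
    return b"".join(disk[z + i] for i in range(w))
```

The point of the model is that the second function never looks at the file name: it trusts whatever Z and W it finds in the buffer.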
I now had a rough understanding of the mechanism of how files were read from disk and printed to the screen. My understanding also indicated that I could dump the underlying sectors of the disk without needing to know the names of the files comprising those sectors.
In particular: PrintDiskSectors(), at address 0x822C, reads its two arguments from global memory. Those arguments are: 1) Z, i.e., which sector to begin dumping data from, and 2) W, i.e., how many sectors to dump.
And so it was immediately obvious that I could use the undocumented "write" and "call" commands to write whatever I wanted into Z and W and then invoke PrintDiskSectors(). I tried it out in netcat and it worked on the first try -- I was able to dump sector #0.
Thus, I incorporated this functionality into the script based on Python's telnetlib that I had previously used to dump the memory. Since my understanding at the time was a bit off, I ended up with a loop that executed 0x40 times (the number of sectors), which wrote 0x0000 in for Z (i.e., start reading at sector 0), and each iteration of the loop wrote an increasing value into W, starting with 1 and ending with 0x3F. The script would dump the returned data as a binary file, as well as printing it onto the screen. You can find the script here, and its output here.
I let my script run and once it got to the final iteration of the loop, there was a message congratulating me and telling me where to send an email, as well as what the contents should be. I immediately sent an email at 1:35AM, about 80 minutes after I'd first dumped the memory off of the device. Shortly thereafter, I received a personalized email congratulating me on my second-place finish.
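As a rough illustration of the command sequence involved (not the original script), the sketch below builds the lines to send for one dump. The address 0x822C for PrintDiskSectors() comes from the analysis above. The addresses of Z and W, the assumption that each is a little-endian 16-bit word, and the exact "write ADDR VALUE" syntax are hypothetical placeholders that would need to be taken from the IDB and the firmware's command parser.

```python
PRINT_DISK_SECTORS = 0x822C  # from the analysis above
Z_ADDR = 0x0200              # hypothetical location of Z (first sector)
W_ADDR = 0x0202              # hypothetical location of W (sector count)


def write_word(addr: int, value: int):
    """Emit two byte-sized 'write' commands for a little-endian word."""
    low = value & 0xFF
    high = (value >> 8) & 0xFF
    return [f"write {addr:04X} {low:02X}", f"write {addr + 1:04X} {high:02X}"]


def dump_commands(first_sector: int, count: int):
    """Commands that set Z and W, then invoke PrintDiskSectors()."""
    return (write_word(Z_ADDR, first_sector)
            + write_word(W_ADDR, count)
            + [f"call {PRINT_DISK_SECTORS:04X}"])
```

Under these assumptions, dumping one sector at a time (a count of 1 with an increasing first sector) would avoid re-reading earlier sectors on every iteration.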
Pokémon-themed Umbreon Linux Rootkit Hits x86, ARM Systems
Research: Trend Micro
There are two packages: one is the full 'found in the wild' package, and the other is a set of files matching the hashes from Trend Micro (all but one file are already in the full package).
Download. Email me if you need the password.
Part one (full package)
| # | File Name | Hash Value | File Size (on Disk) | Duplicate? |
|---|---|---|---|---|
| 1 | .umbreon-ascii | 0B880E0F447CD5B6A8D295EFE40AFA37 | 6085 bytes (5.94 KiB) | |
| 2 | autoroot | 1C5FAEEC3D8C50FAC589CD0ADD0765C7 | 281 bytes (281 bytes) | |
| 3 | CHANGELOG | A1502129706BA19667F128B44D19DC3C | 11 bytes (11 bytes) | |
| 4 | cli.sh | C846143BDA087783B3DC6C244C2707DC | 5682 bytes (5.55 KiB) | |
| 5 | hideports | D41D8CD98F00B204E9800998ECF8427E | 0 bytes (0 bytes) | Yes, of file promptlog |
| 6 | install.sh | 9DE30162E7A8F0279E19C2C30280FFF8 | 5634 bytes (5.5 KiB) | |
| 7 | Makefile | 0F5B1E70ADC867DD3A22CA62644007E5 | 797 bytes (797 bytes) | |
| 8 | portchecker | 006D162A0D0AA294C85214963A3D3145 | 113 bytes (113 bytes) | |
| 9 | promptlog | D41D8CD98F00B204E9800998ECF8427E | 0 bytes (0 bytes) | |
| 10 | readlink.c | 42FC7D7E2F9147AB3C18B0C4316AD3D8 | 1357 bytes (1.33 KiB) | |
| 11 | ReadMe.txt | B7172B364BF5FB8B5C30FF528F6C5125 | 2244 bytes (2.19 KiB) | |
| 12 | setup | 694FFF4D2623CA7BB8270F5124493F37 | 332 bytes (332 bytes) | |
| 13 | spytty.sh | 0AB776FA8A0FBED2EF26C9933C32E97C | 1011 bytes (1011 bytes) | Yes, of file spytty.sh |
| 14 | umbreon.c | 91706EF9717176DBB59A0F77FE95241C | 1007 bytes (1007 bytes) | |
| 15 | access.c | 7C0A86A27B322E63C3C29121788998B8 | 713 bytes (713 bytes) | |
| 16 | audit.c | A2B2812C80C93C9375BFB0D7BFCEFD5B | 1434 bytes (1.4 KiB) | |
| 17 | chown.c | FF9B679C7AB3F57CFBBB852A13A350B2 | 2870 bytes (2.8 KiB) | |
| 18 | config.h | 980DEE60956A916AFC9D2997043D4887 | 967 bytes (967 bytes) | |
| 19 | config.h.dist | 980DEE60956A916AFC9D2997043D4887 | 967 bytes (967 bytes) | Yes, of file config.h |
| 20 | dirs.c | 46B20CC7DA2BDB9ECE65E36A4F987ABC | 3639 bytes (3.55 KiB) | |
| 21 | dlsym.c | 796DA079CC7E4BD7F6293136604DC07B | 4088 bytes (3.99 KiB) | |
| 22 | exec.c | 1935ED453FB83A0A538224AFAAC71B21 | 4033 bytes (3.94 KiB) | |
| 23 | getpath.h | 588603EF387EB617668B00EAFDAEA393 | 183 bytes (183 bytes) | |
| 24 | getprocname.h | F5781A9E267ED849FD4D2F5F3DFB8077 | 805 bytes (805 bytes) | |
| 25 | includes.h | F4797AE4B2D5B3B252E0456020F58E59 | 629 bytes (629 bytes) | |
| 26 | kill.c | C4BD132FC2FFBC84EA5103ABE6DC023D | 555 bytes (555 bytes) | |
| 27 | links.c | 898D73E1AC14DE657316F084AADA58A0 | 2274 bytes (2.22 KiB) | |
| 28 | local-door.c | 76FC3E9E2758BAF48E1E9B442DB98BF8 | 501 bytes (501 bytes) | |
| 29 | lpcap.h | EA6822B23FE02041BE506ED1A182E5CB | 1690 bytes (1.65 KiB) | |
| 30 | maps.c | 9BCD90BEA8D9F9F6270CF2017F9974E2 | 1100 bytes (1.07 KiB) | |
| 31 | misc.h | 1F9FCC5D84633931CDD77B32DB1D50D0 | 2728 bytes (2.66 KiB) | |
| 32 | netstat.c | 00CF3F7E7EA92E7A954282021DD72DC4 | 1113 bytes (1.09 KiB) | |
| 33 | open.c | F7EE88A523AD2477FF8EC17C9DCD7C02 | 8594 bytes (8.39 KiB) | |
| 34 | pam.c | 7A947FDC0264947B2D293E1F4D69684A | 2010 bytes (1.96 KiB) | |
| 35 | pam_private.h | 2C60F925842CEB42FFD639E7C763C7B0 | 12480 bytes (12.19 KiB) | |
| 36 | pam_vprompt.c | 017FB0F736A0BC65431A25E1A9D393FE | 3826 bytes (3.74 KiB) | |
| 37 | passwd.c | A0D183BBE86D05E3782B5B24E2C96413 | 2364 bytes (2.31 KiB) | |
| 38 | pcap.c | FF911CA192B111BD0D9368AFACA03C46 | 1295 bytes (1.26 KiB) | |
| 39 | procstat.c | 7B14E97649CD767C256D4CD6E4F8D452 | 398 bytes (398 bytes) | |
| 40 | procstatus.c | 72ED74C03F4FAB0C1B801687BE200F06 | 3303 bytes (3.23 KiB) | |
| 41 | readwrite.c | C068ED372DEAF8E87D0133EAC0A274A8 | 2710 bytes (2.65 KiB) | |
| 42 | rename.c | C36BE9C01FEADE2EF4D5EA03BD2B3C05 | 535 bytes (535 bytes) | |
| 43 | setgid.c | 5C023259F2C244193BDA394E2C0B8313 | 667 bytes (667 bytes) | |
| 44 | sha256.h | 003D805D919B4EC621B800C6C239BAE0 | 545 bytes (545 bytes) | |
| 45 | socket.c | 348AEF06AFA259BFC4E943715DB5A00B | 579 bytes (579 bytes) | |
| 46 | stat.c | E510EE1F78BD349E02F47A7EB001B0E3 | 7627 bytes (7.45 KiB) | |
| 47 | syslog.c | 7CD3273E09A6C08451DD598A0F18B570 | 1497 bytes (1.46 KiB) | |
| 48 | umbreon.h | F76CAC6D564DEACFC6319FA167375BA5 | 4316 bytes (4.21 KiB) | |
| 49 | unhide-funcs.c | 1A9F62B04319DA84EF71A1B091434C64 | 4729 bytes (4.62 KiB) | |
| 50 | cryptpass.py | 2EA92D6EC59D85474ED7A91C8518E7EC | 192 bytes (192 bytes) | |
| 51 | environment.sh | 70F467FE218E128258D7356B7CE328F1 | 1086 bytes (1.06 KiB) | |
| 52 | espeon-connect.sh | A574C885C450FCA048E79AD6937FED2E | 247 bytes (247 bytes) | |
| 53 | espeon-shell | 9EEF7E7E3C1BEE2F8591A088244BE0CB | 2167 bytes (2.12 KiB) | |
| 54 | espeon.c | 499FF5CF81C2624B0C3B0B7E9C6D980D | 14899 bytes (14.55 KiB) | |
| 55 | listen.sh | 69DA525AEA227BE9E4B8D59ACFF4D717 | 209 bytes (209 bytes) | |
| 56 | spytty.sh | 0AB776FA8A0FBED2EF26C9933C32E97C | 1011 bytes (1011 bytes) | |
| 57 | ssh-hidden.sh | AE54F343FE974302F0D31776B72D0987 | 127 bytes (127 bytes) | |
| 58 | unfuck.c | 457B6E90C7FA42A7C46D464FBF1D68E2 | 384 bytes (384 bytes) | |
| 59 | unhide-self.py | B982597CEB7274617F286CA80864F499 | 986 bytes (986 bytes) | |
| 60 | listen.sh | F5BD197F34E3D0BD8EA28B182CCE7270 | 233 bytes (233 bytes) | |
part 2 (those listed in the Trend Micro article)
| # | File Name | Hash Value | File Size (on Disk) |
|---|---|---|---|
| 1 | 015a84eb1d18beb310e7aeeceab8b84776078935c45924b3a10aa884a93e28ac | A47E38464754289C0F4A55ED7BB55648 | 9375 bytes (9.16 KiB) |
| 2 | 0751cf716ea9bc18e78eb2a82cc9ea0cac73d70a7a74c91740c95312c8a9d53a | F9BA2429EAE5471ACDE820102C5B8159 | 7512 bytes (7.34 KiB) |
| 3 | 0a4d5ffb1407d409a55f1aed5c5286d4f31fe17bc99eabff64aa1498c5482a5f | 0AB776FA8A0FBED2EF26C9933C32E97C | 1011 bytes (1011 bytes) |
| 4 | 0ce8c09bb6ce433fb8b388c369d7491953cf9bb5426a7bee752150118616d8ff | B982597CEB7274617F286CA80864F499 | 986 bytes (986 bytes) |
| 5 | 122417853c1eb1868e429cacc499ef75cfc018b87da87b1f61bff53e9b8e8670 | 9EEF7E7E3C1BEE2F8591A088244BE0CB | 2167 bytes (2.12 KiB) |
| 6 | 409c90ecd56e9abcb9f290063ec7783ecbe125c321af3f8ba5dcbde6e15ac64a | B4746BB5E697F23A5842ABCAED36C914 | 6149 bytes (6 KiB) |
| 7 | 4fc4b5dab105e03f03ba3ec301bab9e2d37f17a431dee7f2e5a8dfadcca4c234 | D0D97899131C29B3EC9AE89A6D49A23E | 65160 bytes (63.63 KiB) |
| 8 | 8752d16e32a611763eee97da6528734751153ac1699c4693c84b6e9e4fb08784 | E7E82D29DFB1FC484ED277C702187818 | 55564 bytes (54.26 KiB) |
| 9 | 991179b6ba7d4aeabdf463118e4a2984276401368f4ab842ad8a5b8b73088522 | 2B1863ACDC0068ED5D50590CF792DF05 | 7664 bytes (7.48 KiB) |
| 10 | a378b85f8f41de164832d27ebf7006370c1fb8eda23bb09a3586ed29b5dbdddf | A977F68C59040E40A822C384D1CEDEB6 | 176 bytes (176 bytes) |
| 11 | aa24deb830a2b1aa694e580c5efb24f979d6c5d861b56354a6acb1ad0cf9809b | DF320ED7EE6CCF9F979AEFE451877FFC | 26 bytes (26 bytes) |
| 12 | acfb014304b6f2cff00c668a9a2a3a9cbb6f24db6d074a8914dd69b43afa4525 | 84D552B5D22E40BDA23E6587B1BC532D | 6852 bytes (6.69 KiB) |
| 13 | c80d19f6f3372f4cc6e75ae1af54e8727b54b51aaf2794fedd3a1aa463140480 | 087DD79515D37F7ADA78FF5793A42B7B | 11184 bytes (10.92 KiB) |
| 14 | e9bce46584acbf59a779d1565687964991d7033d63c06bddabcfc4375c5f1853 | BBEB18C0C3E038747C78FCAB3E0444E3 | 71940 bytes (70.25 KiB) |