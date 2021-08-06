## Introduction

I have worked with C for all of my student and professional life. There have been times when I had to look a little deeper to understand what was going on with a buggy program.

Along the way, I have learned that there are many tools and techniques for examining an executable, and this post is about them. I will also cover a bit of how to apply some reverse engineering practices.

## Code for the Purpose of Testing

As an example, I wrote a pretty basic piece of code with some intentional inclusions.

There are two global variables, `msga` and `msgb`.

Two user-defined routines, `allow` and `deny`, are defined; `main` calls only `deny`.

One conditional call to an external program is made using `execvp`; the condition is hardcoded to false, so it never runs.

The idea here is to examine the executable this program creates: find out where the code I wrote lands in the executable and what the compiler adds on top of it.

Later I'll showcase some basic reverse engineering that can be done by pretending we haven't seen the code.

[GitHub](https://github.com/ranuzz/makeall-code/blob/main/examinebin/examinebin.c)

```c
#include <stdio.h>
#include <unistd.h>

char *msga = "Allow";
char *msgb = "Deny";

void allow() {
    printf("%s\n", msga);
}

void deny() {
    printf("%s\n", msgb);
}

int main(int argc, char **argv) {
    deny();
    int runExternal = 0;
    if (runExternal) {
        char* lsargs[] = {"ls", "-l", NULL};
        execvp("ls", lsargs);
    }
}
```

While dealing with executable code we'll encounter a lot of hexadecimal values; I prefer using Python to do quick arithmetic whenever the need arises.

Also, some of the output is too big to paste here in the post, so I'll link it at the end.

## Goal

Let's compile the program and get our `a.out`.

```shell
gcc examinebin.c
```

As the code shows, the program simply calls `deny` and exits.

The goal I am setting for myself is to identify the relevant instructions inside the executable and change them so that `allow` is called and then `ls` is executed with the `-a` argument.

:::info
## **PS: The tools used for the analysis and their output are listed at the bottom of this post for reference.**

:::

## Call Sequence

If we look at the `objdump` [output](https://github.com/ranuzz/makeall-code/blob/main/examinebin/objdump.txt), it is neatly divided into sections and clearly labeled with symbol names. The code we are interested in is the code we wrote, but it's nice to know what everything else is.

The short version is that every C program needs a `main` routine, which marks the start and end of user-written code.

The C runtime executes `main` within its framework and takes care of all static and runtime dependencies.

The order of execution can be determined easily by loading the executable in `gdb` and adding a breakpoint on every symbol defined in the `.text` section, plus `_init` and `_fini`.

Let's see what happens.

> breakpoints: `_init`, `_start`, `deregister_tm_clones`, `register_tm_clones`, `__do_global_dtors_aux`, `frame_dummy`, `allow`, `deny`, `main`, `__libc_csu_init`, `__libc_csu_fini`, `_fini`

Below is the call sequence, labelled by me based on my understanding of the usual meaning of these symbols:

```c
// Initialisation
_init (argc=1, argv=0x7fffffffdfd8, envp=0x7fffffffdfe8)
_start ()
__libc_csu_init ()
_init ()
frame_dummy ()
register_tm_clones ()

// User Code
main ()
deny () // we want to call allow and execvp here instead


// Deconstruction and finalisation
__do_global_dtors_aux ()
deregister_tm_clones ()
deregister_tm_clones ()
_fini ()
```

## Identification

Now that we know what we don't have to explore, we can focus on the task at hand: calling `allow` and running `ls` with `-a`.

To do that, let's state the goal precisely. We want to:

* Call `allow` instead of `deny`.
* Change the `runExternal` flag value to non-zero.
* Change `"-l"` to `"-a"` in `lsargs`.

To do that, we have to know where these values are in the binary and then change them manually without disturbing anything else.

## Replace `deny`

The hexadecimal code that calls `deny`, from the `objdump` output:

```txt
0000000000001189 <allow>:

00000000000011a3 <deny>:

00000000000011bd <main>:
    11e4:  e8 ba ff ff ff        callq  11a3 <deny>
    11e9:  c7 45 dc 00 00 00 00  movl   $0x0,-0x24(%rbp)
```

From the [callq](https://www.felixcloutier.com/x86/call) reference, we know that opcode `e8` takes a 4-byte operand, here `ba ff ff ff`. Read as a little-endian signed 32-bit value (0xffffffba, i.e. -0x46), it is the offset from the address of the next instruction, `0x11e9`. So it should point to `0x11a3`.

```python
offset = 0xffffffba - 0x100000000  # reinterpret the operand as a negative 32-bit value
deny_addr = hex(0x11e9 + offset)   # the offset is relative to the next instruction
print(deny_addr)
# 0x11a3
```

To call `allow` instead, we have to change 0xffffffba to a value that resolves to 0x1189.

```python
allow_addr = 0x1189
offset = hex((allow_addr - 0x11e9) + 0x100000000)  # wrap the negative offset to 32 bits
print(offset)
# 0xffffffa0 -> a0 ff ff ff
```

**So all we need to do is change** `ba` to `a0` in the binary.

## Change `runExternal`

This is quite simple: all we need to do is locate the `mov` instruction that stores the value in the flag.

```txt
    11e9:  c7 45 dc 00 00 00 00  movl   $0x0,-0x24(%rbp)
```

Then change the immediate value to any non-zero one. [ref](https://www.felixcloutier.com/x86/mov)

**00 00 00 00 -> 01 00 00 00**

## Change `"-l"`

We want to change the arguments going into the `execvp` function call.

In the assembly, we can see where the `callq` to `execvp` is made, and before it there should be instructions such as `lea` that load the argument addresses and place them on the stack or in registers.

Since these strings are hardcoded in the binary, all we need to do is find the location of `-l` and change it to `-a`.

```txt
    11f6:  48 8d 05 12 0e 00 00  lea    0xe12(%rip),%rax   # 200f <_IO_stdin_used+0xf>
    11fd:  48 89 45 e0           mov    %rax,-0x20(%rbp)
    1201:  48 8d 05 0a 0e 00 00  lea    0xe0a(%rip),%rax   # 2012 <_IO_stdin_used+0x12>
    1208:  48 89 45 e8           mov    %rax,-0x18(%rbp)
    121b:  48 8d 3d ed 0d 00 00  lea    0xded(%rip),%rdi   # 200f <_IO_stdin_used+0xf>
    1222:  e8 69 fe ff ff        callq  1090 <execvp@plt>
```

The [lea](https://www.felixcloutier.com/x86/lea) instruction calculates an effective address, which in every case here is an offset relative to the address of the next instruction (`%rip`).

So we have three addresses, which can be calculated by hand or read from the `objdump` output.

```python
# displacement + address of the instruction that follows each lea
print(hex(0xe12 + 0x11fd)) # 0x200f
print(hex(0xe0a + 0x1208)) # 0x2012
print(hex(0xded + 0x1222)) # 0x200f
```

From the `hexdump` output we can see that our strings are really there.

```txt
00002000: 0100 0200 416c 6c6f 7700 4465 6e79 006c  ....Allow.Deny.l
00002010: 7300 2d6c 0000 0000 011b 033b 5400 0000  s.-l.......;T...
```

Changing the fourth byte of the second row (offset 0x2013) **6c -> 61** turns `l` into `a`.

## Changes

Let's summarize and make all the necessary changes to the text output produced by the `xxd` utility. In each pair below, the first row is the original and the second is the modified one; the parentheses only mark the changed byte and are not part of the actual dump.

*Changes for* `allow`

```txt
000011e0: 0000 0000 e8(ba) ffff ffc7 45dc 0000 0000
000011e0: 0000 0000 e8(a0) ffff ffc7 45dc 0000 0000
```

*Changes for the* `runExternal` *flag*

```txt
000011e0: 0000 0000 e8ba ffff ffc7 45dc (00)00 0000
000011e0: 0000 0000 e8ba ffff ffc7 45dc (01)00 0000
```

*Changes for* `l -> a`

```txt
00002010: 7300 2d(6c) 0000 0000 011b 033b 5400 0000
00002010: 7300 2d(61) 0000 0000 011b 033b 5400 0000
```

## Create a new Executable

*Using the* `xxd` *utility*

```shell
xxd -r modified-xxd.txt > a2.out
```

*Change permission*

```shell
chmod +x a2.out
```

## Run

I can tell you that it actually works, but it's better to try it yourself. The output is:

### Before

```shell
Deny
```

### After

```shell
Allow
. .. a.out a2.out
```

*Once we do some reverse engineering, this output will tell us that the new executable is not genuine.*

*[output](https://github.com/ranuzz/makeall-code/blob/main/examinebin/sum.txt)*

# `ldd`

*Gives the list of shared objects required by the executable.*

*[output](https://github.com/ranuzz/makeall-code/blob/main/examinebin/ldd.txt)*

**There are some utilities that give a quick peek at the executable when an in-depth examination is not something you need.**

# `strings`

*Displays all printable characters and strings in the file. Works on any file, not just executables.*

*[output](https://github.com/ranuzz/makeall-code/blob/main/examinebin/strings.txt)*
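To make the idea concrete, here is a minimal Python sketch of what `strings` does, assuming the common default of reporting runs of at least four printable ASCII characters. The function name `extract_strings` is hypothetical, and the sample bytes are copied from the start of the hexdump rows shown earlier; the real tool has more options and handles other encodings.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII bytes that are at least min_len long."""
    # 0x20-0x7e covers space through tilde, i.e. the printable ASCII range.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Bytes copied from the hexdump rows at 0x2000 and 0x2010.
rodata = bytes.fromhex("0100 0200 416c 6c6f 7700 4465 6e79 006c 7300 2d6c 0000 0000")
print(extract_strings(rodata))             # ['Allow', 'Deny']
print(extract_strings(rodata, min_len=2))  # ['Allow', 'Deny', 'ls', '-l']
```

With the default minimum length, `ls` and `-l` are too short to be reported, which is worth remembering when a short string seems to be missing from the output.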

# `nm`

*Lists the symbols present in the executable along with their addresses.*

*[output](https://github.com/ranuzz/makeall-code/blob/main/examinebin/nm.txt)*

*Now comes the in-depth analysis of the executable. This includes translating the machine code into human-readable form and figuring out a way to edit the file.*

# `objdump`

*Using the* `-d` *option, you get a disassembly of each executable section, with the raw bytes shown next to the interpreted assembly instructions.*

*[output](https://github.com/ranuzz/makeall-code/blob/main/examinebin/objdump.txt)*
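As a cross-check on what `objdump -d` prints, the target of a near `call` can be recomputed from the raw bytes. The sketch below is a minimal Python illustration, assuming the 5-byte `e8` form with a little-endian signed 32-bit offset; the helper names `call_target` and `encode_call` are hypothetical.

```python
import struct

CALL_REL32 = 0xE8  # opcode of the near relative call
CALL_LEN = 5       # one opcode byte plus four offset bytes

def call_target(addr: int, insn: bytes) -> int:
    """Return the address that a 5-byte e8 call located at addr jumps to."""
    if len(insn) != CALL_LEN or insn[0] != CALL_REL32:
        raise ValueError("not a 5-byte e8 call")
    (rel,) = struct.unpack("<i", insn[1:])  # signed, little-endian
    return addr + CALL_LEN + rel            # relative to the next instruction

def encode_call(addr: int, target: int) -> bytes:
    """Build the 5 bytes of an e8 call at addr that reaches target."""
    rel = target - (addr + CALL_LEN)
    return bytes([CALL_REL32]) + struct.pack("<i", rel)

print(hex(call_target(0x11e4, bytes.fromhex("e8baffffff"))))  # 0x11a3
print(encode_call(0x11e4, 0x1189).hex(" "))                   # e8 a0 ff ff ff
```

Using `struct` with the `<i` format avoids the manual `0x100000000` arithmetic, since it handles both the byte order and the sign.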

# `xxd` or `hexdump`

*These are plain read-write tools that work on any binary file, not just executables.*

*The reading part creates a text file showing the hexadecimal value of each byte, with a printable version side by side where possible.*

*Any changes made to this text file can be fed back to the tool, which then creates a binary file from it.*

*I am using* `xxd` *for both reading and writing the executable here.*

*[output](https://github.com/ranuzz/makeall-code/blob/main/examinebin/xxd.txt)*
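The same round trip can be done without the intermediate text file. The following is a minimal Python sketch, assuming that file offsets match the addresses shown in the `xxd` rows of the Changes section (as they do for this binary). The names `PATCHES`, `patch_bytes`, and `patch_file` are hypothetical, and each patch checks the expected original byte so that a different build is rejected rather than silently corrupted.

```python
# (file offset, expected old byte, new byte), taken from the Changes section
PATCHES = [
    (0x11e5, 0xba, 0xa0),  # call deny -> call allow
    (0x11ec, 0x00, 0x01),  # runExternal = 0 -> 1
    (0x2013, 0x6c, 0x61),  # "-l" -> "-a"
]

def patch_bytes(data: bytes, patches) -> bytes:
    """Return a copy of data with every (offset, old, new) patch applied."""
    out = bytearray(data)
    for offset, old, new in patches:
        if out[offset] != old:
            raise ValueError(f"unexpected byte {out[offset]:#04x} at {offset:#x}")
        out[offset] = new
    return bytes(out)

def patch_file(src: str, dst: str, patches=PATCHES) -> None:
    """Read src, apply the patches, and write the result to dst."""
    with open(src, "rb") as f:
        original = f.read()
    with open(dst, "wb") as f:
        f.write(patch_bytes(original, patches))
```

Calling `patch_file("a.out", "a2.out")` and then running `chmod +x a2.out` should give the same result as editing the `xxd` text and reversing it, provided the binary was built the same way as the one examined in this post.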