0x00 - Preface

In this article series I will be going over different types of binary exploits in detail, explaining what they are, how they work, the technologies behind them, and some defenses against them. Throughout this series I will do my best to explain these attacks, defenses, technologies, and concepts in a way that anyone, from beginner to 1337 h4x0r, can understand.

Please note: While I will be adding some key Prerequisite Knowledge sections in hopes of making the more technical explanations of these attacks easier to understand, this article series will not go over all of the information / concepts / technologies necessary to be proficient in the field of binary exploitation.

In this article, we'll be covering:

0x01. Prerequisite Knowledge: Application Memory
0x02. Prerequisite Knowledge: The Stack
0x03. Prerequisite Knowledge: Function Calls and Returns
0x04. Attack: Stack Buffer Overflows
0x05. Attack: Return-to-libc (ret2libc) attacks

Click below for the next part of this series:
Binary Exploitation ELI5 - Part 2

0x01 - Prerequisite Knowledge: Application Memory

When an application is executed, it is loaded into memory. However, as we all know, computers have a finite amount of memory, so they have to be extremely careful when loading things into it so as not to overwrite any other application. To do this, computers use a concept called Virtual Memory, which can be perfectly summed up using the scene from the early 2000s TV show Drake and Josh in which Drake and Josh take a job organizing sushi into containers.

In the scene, Drake and Josh get a job where they take sushi coming down a conveyor belt and organize the pieces into containers. Furthermore, while all the sushi containers look exactly the same, it is crucial that each container only contains one type of sushi.
So, let's break the analogy down and relate it to the concept of Virtual Memory:

The Sushi Conveyor Belt: As I said above, computers have to be very careful and precise about where they put application data in memory so that nothing is overwritten. Although a computer could simply place applications directly in physical memory, this would eventually cause problems, as application fragments would quickly fill up the entire space. In the above example, the individual sushi pieces can be seen as application fragments, or chunks of memory allocated by the application, while the entire set of sushi (6 per container) can be seen as the application itself.

Drake and Josh: To circumvent the issue of filling up the conveyor belt with individual pieces of sushi, Drake and Josh organized them into individual containers, which were then allowed to move down the conveyor belt. Much like Drake and Josh, your computer organizes applications into containers as well, called virtual memory locations. These virtual memory locations (or Virtual Address Spaces) allow each application to believe it has full control over the entire scope of memory. However, when an application accesses a location or tries to allocate memory within its Virtual Address Space, it is not granted access to arbitrary physical memory. Instead, a small but extremely important piece of hardware in your computer's CPU (Central Processing Unit) called the MMU (Memory Management Unit) maps the application's request to a specific region of physical memory and facilitates any memory manipulation. This memory mapping allows computers to organize and process multiple applications with dynamic memory requirements through a centrally organized lookup table.
[Figure: An ASCII diagram of the virtual memory process]

It is also important to note that while all of an application's own code is contained within its virtual address space, applications often use dynamically linked libraries (DLLs) such as libc or kernel32. These DLLs are simply external system applications or other custom applications that the program imports code from. "External" here means that their code is not stored within the application's own executable file; instead, the library is mapped into the application's address space when the program is loaded. Take the below code for example:

[Figure: A basic C function]

As you can see, nowhere in this 6 line program do I actually define what printf is. However, this program will still run without issue and print out "Hello World". This is because the printf function is a system function defined in libc, which is the standard C library. When the program is built, libc is linked to the executable as an external library. On a Linux system, you can view a program's shared library dependencies using the ldd command.

[Figure: Displaying a program's shared library dependencies with ldd]

If you're looking at the above screenshot and wondering what in the world 0xb7e99000 is, well, that's the address of the libc library in memory. Memory addresses are represented in hexadecimal format. Please click here to get some more information on the hexadecimal number system.

0x02 - Prerequisite Knowledge: The Stack

The Stack is simply a large data structure that is used to store application information and data during runtime. The stack's functionality can be simply explained through the following analogy:

Bob is a dishwasher at a fancy restaurant. Each night, Bob has a stack of plates to wash. Furthermore, throughout the night more plates may be added to the top of Bob's stack whenever a table is cleared off. If Bob takes a plate from anywhere but the top of the stack, all the ones above it will fall and break. Now, instead of Bob and a stack of plates, simply imagine a computer and a stack of data objects.
Whenever something is pushed onto the stack, it is added to the top of the stack, and whenever something is popped off the stack, it is removed from the top of the stack, making it a Last In First Out (LIFO) mechanism.

The stack is used by programs to hold all sorts of things, such as function pointers (the location of a function in memory) and variables.

0x03 - Prerequisite Knowledge: Function Calls and Returns

Take a look at the below code:

[Figure: A basic C program]

In this code snippet, we see that the add function takes 2 integer type arguments called A and B. In the main function, we can see that we've called add with the number 1 for the argument A and the number 2 for the argument B. If we break this code down into its underlying machine code, we see:

[Figure: Calling the add function with 2 arguments]

As you can see, when calling a function with parameters, the program first pushes both parameters onto the stack and then executes a call statement. (This is the typical 32-bit x86 convention; 64-bit programs usually pass the first few arguments in registers instead.) This call statement redirects the program's instruction pointer to the address of the function being called. An instruction pointer is like the little pencil you use to keep track of which word you're reading: it always points to the instruction that's about to be executed (the word that's about to be read). However, before navigating to the called function, the call statement pushes the address of the instruction right after it onto the stack, so that when the add function returns, the program will know where to continue processing from. The address of the location that the function should return to is called the function's return pointer.

0x04 - Attack: Stack Buffer Overflows

Before going into technical detail about what Stack Buffer Overflows are and how they work, let's look at a quick, easy-to-understand analogy:

Alice and Bob used to date, but Alice ended up breaking up with Bob. As time went on, Alice moved on, but Bob never really got over the heartbreak.
Now, Alice is getting married to Robert Hackerman, Bob's arch-nemesis. Bob, being a creepy weirdo, spied on all of Alice's wedding plans through his secret access to Alice's email account. Bob saw that Alice had hired a famous wedding cake designer who wanted Alice to edit parts of his recipe to suit her flavor preferences. The designer gave Alice a recommended list of ingredients to add but said he would do whatever she wanted, precisely.

Bob opened up the document attached to the designer's email and saw that the recipe's custom lines looked like:

… Then, we'll add flavor to the frosting by adding ______. After that, we'll add some chocolate …

Bob noticed that if you entered "banana" into the line, the text would look like:

… Then, we'll add flavor to the frosting by adding banana. After that, we'll add some chocolate …

But, if Bob entered "strawberry" into the line, the text would look like:

… Then, we'll add flavor to the frosting by adding strawberryter that, we'll add some chocolate …

Bob realized that this would be the perfect way to ruin Alice's wedding. All he had to do was overwrite the rest of the recipe with his own, disgusting, version! On Alice's wedding day, the designer finally revealed the cake he had made. It was covered in bugs and made out of frozen mayonnaise!

A stack buffer overflow, much like Bob's attack, overwrites data that the developer didn't intend to have overwritten, which can give an attacker control over the program and its output(s).

So, now let's see it in the real world. Take a look at the following piece of code from exploit-exercises.com:

[Figure: Exploit-Exercises.com Protostar Stack0 code]

In the above function, we see that a character type array called buffer is created with a size of 64. Then, we see that the variable modified is set to 0 and the gets function is called with the variable buffer as an argument. Finally, we see an if statement that checks if modified is not 0.
Obviously, nowhere in this application is the modified variable set to anything other than 0, so how are we going to change it?

Well, let's first take a look at the gets function documentation:

[Figure: gets function defined]
[Figure: gets function bugs section]

As you can see, the gets function simply takes in user input. However, the function does not check if the user input actually fits into the data structure we're storing it in (in this case, buffer), and thus we're able to overflow the data structure and affect other variables / data on the stack. Furthermore, since we know that local variables like these are stored on the stack, and we know what the modified variable is (0), all we have to do is enter enough input (more than the 64 bytes that buffer can hold) to overwrite the modified variable. Let's take a look at a diagram:

[Figure: An ASCII diagram of a stack buffer overflow]

As you can see, if a malicious user simply enters too much text, they can overwrite the modified variable and whatever else is stored beyond the buffer on the stack, including return pointers. This means that if a malicious agent is able to take control of a program's stack, they are effectively able to take control of the entire program and make it do whatever they want. They could simply overwrite a function's return pointer on the stack with a custom one that points at some malicious payload.

0x05 - Attack: ret2libc

Before we talk about Return-to-libc (ret2libc) attacks, let's take a moment to discuss libc a little bit deeper.

As we know (from section 0x01), libc is the standard C library. This means that it contains all the generic system functions included in the C programming language. Now, what if a malicious user was able to take control of the program to execute some of these functions?

Well, that's pretty much exactly what ret2libc is. One perfect analogy for ret2libc's consequences could be the Matrix series. Think back to the classic "Guns, lots of guns" scene.
Tank, the operator, was able to completely bypass and reprogram the matrix to make A TON of guns just appear out of nowhere.

You can sort of think of return-to-libc like that: we're able to take control of the matrix (the standard C library) and make it do whatever we want.

At its base, a ret2libc attack is actually built on a stack buffer overflow. Think back to what I said at the end of section 0x04: if a malicious agent can overwrite data on the stack, they can simply overwrite the return pointer to point to a specific function within libc and pass it whatever arguments are necessary to deliver a payload.

One of the most common functions to use for ret2libc attacks is the system function. Let's take a look at its documentation:

[Figure: The system command's documentation]

As you can see, the system command simply executes shell commands (the shell is the Linux command line). Furthermore, if we read into the description, we can see that system simply executes /bin/sh -c <command> (/bin/sh is the actual shell command), and the command is passed into the function through an argument.

So, all we have to do to gain command-line access to the machine that the vulnerable application is running on is place the address of the string "/bin/sh" on the stack as an argument, then replace a return or call pointer with the memory address of the system function, so that the function is called with /bin/sh as an argument, starting up a shell and granting us access to the system with the privileges of the vulnerable application.

[Figure: Exploits, lots of exploits.]

0x06 - Part 1 Conclusion

In this article we covered:

0x01. Virtual memory and how applications are processed in memory
0x02. Dynamically Linked Libraries and libc
0x03. The Stack
0x04. How functions are called and how returning from a function works
0x05. Stack Buffer Overflows
0x06. Return-to-libc (ret2libc) attacks

I hope this article was helpful.
Click below to continue on to Part 2 of this series:
Binary Exploitation ELI5 - Part 2

Also, if you're interested in reverse engineering, please check out my article series, BOLO: Reverse Engineering:
BOLO: Reverse Engineering - Part 1 (Basic Programming Concepts)
BOLO: Reverse Engineering - Part 2 (Advanced Programming Concepts)

And, if you're looking for more ELI5 content, check out my article Explain Spectre and Meltdown Like I'm 5.

push "Thanks"
push "for"
call Reading