In earlier articles, we covered "C runtime: before starting main" & "How C program stored in RAM memory". Here we will see "How C program converts into the assembly?" and different aspects of how it works at the machine level.
/!\: Originally published @ www.vishalchovatiya.com.
General Purpose Registers
Pointer Register
Segment Register
Index Registers
Apart from these, there are many other registers, some of which even I don't know about. But the registers mentioned above are sufficient to understand the subsequent topics.
We will consider the following example, with its disassembly inlined, to understand different aspects of how it works at the machine level:
We will focus on the stack frame of the function func(). But before analysing its stack frame, we will see how a function call happens.
Function calling is done by the call instruction (see Line 15), which is a subroutine instruction equivalent to:
push rip + 1 ; return address is the address of the next instruction
jmp func
Here, call stores rip+1 on the stack (note that +1 is just for simplicity; technically it is replaced by the size of the instruction). This is the return address, used once the call to func() ends.
A function stack frame is divided into three parts:
1. Prologue/Entry: As you can see, the instructions (Lines 2 to 4) generated for the opening bracket { are the prologue, which sets up the stack frame for func(). Line 2 pushes the previous frame pointer onto the stack, and Line 3 updates the current frame pointer with the stack end, which becomes the start of the new frame.
push is basically equivalent to:
sub esp, 4 ; decrements ESP by 4, which is a kind of space allocation
mov [esp], X ; put the new stack item value X in
(This is the 32-bit form. In 64-bit code the stack pointer is rsp and a push moves it by 8 bytes.)
The argument of func() is stored in the edi register on Line 14, before the call instruction executes. If there are more arguments, they are passed in subsequent registers or on the stack, where they are accessed by address.
Line 4 in func() reserves space for the parameter arg at an offset of 4 bytes below the frame pointer (held in the rbp register), as it is of type int. The mov instruction then initializes it with the value stored in edi. This is how parameters are passed & stored in the current stack frame.

              ---|-------------------------|--- main()
                 |                         |
                 |                         |
                 |                         |
                 |-------------------------|
                 |   main frame pointer    |
    rbp & rsp ---|-------------------------|--- func()
    in func()    |           arg           |
                 |-------------------------|
                 |            a            |
                 |-------------------------|    stack
                 |            +            |      |
                 |            +            |      |
                 |            +            |      |
              ---|-------------------------|---  \|/
                 |                         |
                 |                         |
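The fact that a parameter gets its own slot in the callee's frame can be observed from plain C. This is a minimal sketch, not the listing discussed above: caller_side() and callee_side() are hypothetical names, and the sketch relies only on the C rule that a parameter is a separate object from the caller's variable.

```c
#include <stdbool.h>

/* Hypothetical helper: receives a copy of the caller's value in `arg`.
   `caller_var` is passed only so that the two addresses can be compared. */
static bool callee_side(int arg, const int *caller_var)
{
    int a = 5;      /* local variable, lives in the same frame as arg */
    arg += a;       /* modifies only the callee's copy */

    /* arg and *caller_var are distinct live objects, so their addresses differ */
    return (&arg != caller_var) && (arg == *caller_var + 5);
}

/* Hypothetical caller: returns true if the callee worked on a separate copy
   and the caller's own variable was left untouched. */
bool caller_side(void)
{
    int x = 10;
    bool separate_copy = callee_side(x, &x);
    return separate_copy && x == 10;
}
```

Taking the address of arg forces the compiler to give it a real memory location, which is the same store-to-frame step the prologue performs with the value that arrived in edi.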
2. User code: Line 5 reserves space for the local variable a, again 4 bytes further below the frame pointer. The mov instruction initializes that memory with the value 5.
g is addressed directly, with absolute addressing, because its address is fixed and lies in the data segment. Where the access goes through the rip register, the assembler and linker cooperate to compute the offset of g from the final location of the current instruction, which is pointed to by the rip register.
3. Epilogue/Exit: After the user code has executed, the previous frame pointer, which we stored in Line 2, is retrieved from the stack by the pop instruction. pop is equivalent to:
mov X, [esp] ; put top stack item value into X
add esp, 4 ; increments ESP by 4, which is a kind of deallocation
The ret instruction jumps back to the instruction following the call to func() by retrieving the jump address that the call instruction stored on the stack. ret is a subroutine instruction which is equivalent to:
pop rip ; take the return address off the stack
jmp rip ; continue execution from that address
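The push, pop, call and ret equivalences above can be tied together in a toy model. This is only a sketch under simplifying assumptions: the Machine struct and simulate_call() are hypothetical, the stack is an array of 8-byte slots indexed by rsp rather than real memory, and addresses are plain integers.

```c
#include <stdint.h>

enum { STACK_WORDS = 16 };

/* Toy machine (hypothetical): just enough state to follow one call. */
typedef struct {
    uint64_t stack[STACK_WORDS]; /* stack memory, one slot per entry */
    uint64_t rsp;                /* index of the top item; grows downward */
    uint64_t rbp;                /* frame pointer */
    uint64_t rip;                /* address of the current instruction */
} Machine;

/* push: make room first, then store the value */
static void push(Machine *m, uint64_t x) { m->rsp -= 1; m->stack[m->rsp] = x; }

/* pop: read the top value, then release the slot */
static uint64_t pop(Machine *m) { uint64_t x = m->stack[m->rsp]; m->rsp += 1; return x; }

/* call: push the return address, then jump to the target */
static void call(Machine *m, uint64_t target, uint64_t insn_size)
{
    push(m, m->rip + insn_size);
    m->rip = target;
}

/* ret: pop the return address into rip, which is itself the jump */
static void ret(Machine *m) { m->rip = pop(m); }

/* Runs call -> prologue -> epilogue -> ret.
   Returns the final rip, or 0 if the stack did not end up balanced. */
uint64_t simulate_call(uint64_t call_site, uint64_t target, uint64_t insn_size)
{
    Machine m = { .rsp = STACK_WORDS, .rbp = STACK_WORDS, .rip = call_site };

    call(&m, target, insn_size);
    push(&m, m.rbp);    /* prologue: push rbp */
    m.rbp = m.rsp;      /* prologue: mov rbp, rsp */
    m.rbp = pop(&m);    /* epilogue: pop rbp */
    ret(&m);

    return (m.rsp == STACK_WORDS && m.rbp == STACK_WORDS) ? m.rip : 0;
}
```

For example, simulate_call(100, 200, 5) ends with rip at 105, the instruction right after the call, and with rsp and rbp restored to the values they had before the call.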
If a return value is specified, it is stored in the eax register, which you can see in Line 16.
So, this is it for “How C program converts into assembly?”. This kind of information is strictly coupled with the compiler & ABI, but most compilers, ABIs & instruction set architectures follow more or less the same approach. In case you have not gone through my previous articles, here are some simple FAQs to help you understand better:
Q. How do you determine the stack growth direction?
A. Simple…! By comparing the addresses of local variables of two different functions.
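#include <stdio.h>

int *main_ptr = NULL;
int *func_ptr = NULL;

void func() { int a; func_ptr = &a; }

int main()
{
    int a; main_ptr = &a;
    func();
    /* main's frame is created first, so a higher address in main means the stack grows down.
       Comparing unrelated pointers is not strictly portable C, but works on common platforms. */
    (main_ptr > func_ptr) ? printf("DOWN\n") : printf("UP\n");
    return 0;
}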
Q. How do you corrupt the stack deliberately?
A. Corrupt the SFR values stored in the stack frame.
#include <string.h>

void func()
{
    int a;
    memset(&a, 0, 100); // Writes far past a: corrupts SFR values stored in stack frame
}

int main()
{
    func();
    return 0;
}
Q. How can you increase the stack frame size?
A. alloca() is the answer, although it is not recommended. Google it for details.
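To make the alloca() answer concrete, here is a minimal sketch. It assumes a platform that ships <alloca.h> (glibc and macOS do; alloca() is not part of standard C), and sum_with_alloca() is a hypothetical function written only for illustration.

```c
#include <alloca.h>  /* assumption: glibc or macOS; not part of standard C */
#include <stddef.h>

/* Hypothetical example: sums 1..n using a scratch array carved out of
   the current stack frame instead of the heap. */
long sum_with_alloca(size_t n)
{
    /* alloca() grows this function's stack frame by n * sizeof(long) bytes;
       the space is released automatically when the function returns. */
    long *scratch = alloca(n * sizeof *scratch);
    long total = 0;

    for (size_t i = 0; i < n; i++)
        scratch[i] = (long)(i + 1);
    for (size_t i = 0; i < n; i++)
        total += scratch[i];

    return total;   /* scratch must not be used after this point */
}
```

One reason it is not recommended: alloca() gives no error indication. If n is large enough to overflow the stack, the behaviour is undefined, whereas malloc() would return NULL.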