When you run any C program, its executable image is loaded into the computer's RAM in an organized manner, known as the process address space or the memory layout of a C program. Here I have tried to show you the same thing in two parts.
/!\: Originally published @ www.vishalchovatiya.com.
In Part 1, i.e. "Overview", we will see a segment-wise overview, and in Part 2, i.e. "Example", we'll see how a C program is stored in RAM, with an example.
Part 1: Overview
The memory layout of a C program is organized in the following fashion:
Note: It’s not just these 4 segments; there are a lot more, but these 4 are the core to understanding how a C program works at the machine level.
      HIGHER ADDRESS
+------------------------+
|  Unmapped or reserved  | Command-line argument & Environment variables
|------------------------|------------------------
|     Stack segment      | |
|           |            | | Stack frame
|           v            | v
|                        |
|           ^            | ^
|           |            | | Dynamic memory
|      Heap segment      | |
|------------------------|------------------------
|   Uninitialized data   |
|------------------------| Data segment
|    Initialized data    |
|------------------------|------------------------
|                        |
|      Text segment      | Executable code
|                        |
+------------------------+
      LOWER ADDRESS
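Text segment
The text segment holds the executable code of all the functions that make up the program (main() too), both user-defined and system.
Data segment
There are two subsections of this segment: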
1. Initialized data
For example, a global string defined by char string[] = "hello world" and a statement like int count = 1 outside main() (i.e. global) would be stored in the initialized read-write area. A global statement like const int A = 3; makes the variable A read-only, so it is stored in the initialized read-only area.
2. Uninitialized data (BSS segment)
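For example, a global variable declared as int A would be stored in the uninitialized data segment. A statement like static int X = 0 will also be stored in this segment because it is initialized to zero. This segment is zero-filled by the kernel or the loader (/lib/ld-linux.so.2) before the program starts running.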
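Heap segment
The heap is the area where dynamically allocated memory (allocated by malloc(), calloc(), realloc() and new for C++) resides. That memory is released with free() (or delete). Freed memory goes back to the heap but doesn’t have to be returned to the OS (it doesn’t have to be returned at all), so unordered malloc()/free() calls eventually cause heap fragmentation. You can learn more about how malloc works here.
Stack segment
The stack segment holds a stack frame for every function call, including main() in your C program. I have written a detailed article about the function stack frame here.
Part 2: Example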
We have taken a simple example, as shown in the title, along with its memory layout. In the previous part (i.e. Overview) we discussed how the executable image of our program is divided into different segments and stored in memory (RAM). Now let's understand those blocks using our example code presented above.
Loader
$ readelf --segments ./a.out
Elf file type is EXEC (Executable file)
Entry point 0x8048300
There are 9 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
INTERP 0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
[Requesting program interpreter: /lib/ld-linux.so.2]
LOAD 0x000000 0x08048000 0x08048000 0x00608 0x00608 R E 0x1000
LOAD 0x000f08 0x08049f08 0x08049f08 0x00118 0x00124 RW 0x1000
DYNAMIC 0x000f14 0x08049f14 0x08049f14 0x000e8 0x000e8 RW 0x4
NOTE 0x000168 0x08048168 0x08048168 0x00020 0x00020 R 0x4
GNU_EH_FRAME 0x0004c4 0x080484c4 0x080484c4 0x00044 0x00044 R 0x4
GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x10
GNU_RELRO 0x000f08 0x08049f08 0x08049f08 0x000f8 0x000f8 R 0x1
Section to Segment mapping:
Segment Sections...
00
01 .interp
02 .interp .note.ABI-tag .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .plt.got .text .fini .rodata .eh_frame_hdr .eh_frame
03 .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
04 .dynamic
05 .note.ABI-tag
06 .eh_frame_hdr
07
08 .init_array .fini_array .jcr .dynamic .got
In the readelf output above, you can see that the .data, .bss, .text, etc. sections are there. But a stack segment is not shown, as it is created at run time and decided by the OS (precisely, the loader and kernel).
INTERP in the program header defines the name and path of the loader that is going to load the current binary image into RAM by reading these segments. Here it is /lib/ld-linux.so.2.
Text segment
When you compile C code, you get an executable image (which may be in any form, like .bin, .exe, .hex, .out, or no extension at all). This executable image contains the text segment, which you can see with the Binutils command $ objdump -d <binary_name>. It looks like the following:
.....
080483f1 <main>:
80483f1: 8d 4c 24 04 lea 0x4(%esp),%ecx
80483f5: 83 e4 f0 and $0xfffffff0,%esp
80483f8: ff 71 fc pushl -0x4(%ecx)
.....
These are the executable instructions stored in the text segment as a read-only section, shared between processes if required. The CPU reads these instructions using the program counter, and a stack frame is created on the stack at the time of execution. The program counter points to the address of the instruction to be executed, which lies in the text segment.
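Initialized Data segment
const int x = 1; is stored in the read-only area, so you cannot modify it accidentally.
char str[] = "Hi!"; is stored in the read-write area because we don’t use a keyword like const, which makes a variable read-only. static int var = 0; is writable for the same reason, but since it is initialized to zero it typically ends up in the uninitialized data segment, as noted in Part 1.
Uninitialized Data segment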
int i, declared global, goes to this area of storage because it is not explicitly initialized and is therefore initialized to zero by default.
Heap segment
The heap is used when you allocate memory dynamically with malloc(), calloc(), etc. In our example, we allocated memory using the malloc() function and stored its address in the pointer ptr to keep track of that memory or to access it. ptr is a local variable of main(), hence it’s in main()’s stack frame, but the memory it points to is in the heap, which I have shown by *ptr.
Stack segment
Execution of the program starts at main(), which is also a function; hence a stack frame is created for it during execution. There are, however, many functions called before main(), which I have discussed here. The stack frame of main() is created before that of func() because the calls are nested. When func() finishes executing, its local variable a and its stack frame are destroyed (rewound is the more precise word here); the same goes for main().
Intuitive FAQs
Q. How do you determine the stack growth direction?
A. Simple! By comparing the addresses of local variables in two different functions.
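#include <stdio.h>

int *main_ptr = NULL;
int *func_ptr = NULL;

/* Record the address of a local variable in the callee's stack frame. */
void func(void) { int a; func_ptr = &a; }

int main(void)
{
    int a; main_ptr = &a; /* local variable in the caller's stack frame */
    func();
    /* If the callee's local sits at a lower address, the stack grows down. */
    (main_ptr > func_ptr) ? printf("DOWN\n") : printf("UP\n");
    return 0;
}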
Q. How do you corrupt the stack deliberately?
A. Corrupt the SFR values stored in the stack frame.
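#include <string.h>

void func(void)
{
    int a;
    /* Deliberately writes 100 bytes into a single int (undefined behavior),
       corrupting the SFR values stored in the stack frame. */
    memset(&a, 0, 100);
}

int main(void)
{
    func();
    return 0;
}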
Q. How can you increase the stack frame size?
A. alloca() is the answer. Google it or see this.