Basics for Binary Exploitation
We all know roughly how a C program gets compiled:
- First, your C file goes to the compiler, which converts it into a sequence of operations that the computer will execute.
- Each operation is encoded as a sequence of bytes called an operation code, or opcode.
Why Assembly?
Reading the raw bytes that the computer executes is practically impossible. Assembly is a language designed to represent those machine instructions in a human-readable form. To understand what happens when an executable runs, you must first understand its assembly.
Basic components of a C program
- Heap
- Stack
- Registers
- Instructions
There are mainly two architectures you will meet in assembly:
- x86 (32-bit) (covered in this blog)
- x64 (64-bit)
1. Heap
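The heap holds memory that is allocated manually at run time. Whenever functions like malloc() or calloc() are called, the memory they return comes from the HEAP and stays allocated until free() is called. Global and static variables are not stored on the heap; they live in the program's data segments.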
2. Registers
Registers are small storage areas inside the CPU. They are used to store values, such as addresses or variables, that fit in 4 bytes or less on x86 (8 bytes or less on x64). There are 6 general-purpose registers in total:
- eax (Primary accumulator)
- ebx (Base Register)
- ecx (Counter Register)
- edx (Data Register)
- esi (Source Index)
- edi (Destination Index)
There are 3 registers reserved for specific purposes:
- ebp (Base Pointer)
- esp (Stack Pointer)
- eip (Instruction Pointer)
3. Stack
The stack is a data structure whose elements can be added (push) or removed (pop) only at the top.
- push adds an element on TOP of the stack
- pop removes the element from TOP of the stack
Each stack element has its own address. The stack grows towards lower memory addresses, which means it goes from high address values to low address values. Whenever a function is called, a stack frame is created for it on the stack. For the current stack frame, esp (stack pointer) points to the TOP of the stack and ebp (base pointer) points to the BASE of the frame. Data at addresses below esp is outside the stack and is treated as JUNK.
Let's understand the stack frame through code and its execution process.
#include <stdio.h>

void fun(int x)
{
    int a = 0;
    int b = x;
}

int main()
{
    fun(10);
    return 0;
}
- First of all, the value of the argument, i.e. 10, is stored on the stack.
- Then the return address is stored, i.e. the address of the instruction in main that follows the call to fun.
- Then memory is allocated for variable a (assume 4 bytes).
- The value of variable x is not stored directly into variable b. It is first copied into a general-purpose register such as eax, and then the value in that register is stored into variable b.
The resulting stack frame is shown in the figure (the numbers in red denote the steps to follow).
x86 Assembly
This section covers how your computer executes the code. There are 2 syntaxes for x86 assembly:
- AT&T
- Intel
We will use Intel syntax in this section.
Instructions
Every assembly instruction has two parts: the operation, followed by its operands. E.g.
mov eax,0x5
1. mov
The mov instruction takes 2 arguments:
mov arg1,arg2
This instruction copies the value of argument 2 into argument 1.
E.g.
mov eax,0x5
This is similar to eax = 5.
Note: when a register name is written on its own, mov uses the value stored in that register. When it is written in [] square brackets, the value in the register is treated as a memory address, and mov uses the value stored at that address. Registers themselves do not have memory addresses.
E.g. (1)
consider eax = 0, ebx = 5
mov eax,ebx
This instruction copies the value of ebx, i.e. 5, into eax.
Now eax = 5.
E.g. (2)
consider eax = 0, ebx = 0x1793, and the 4 bytes of memory at address 0x1793 hold the value 5
mov eax,[ebx]
This instruction does not copy 0x1793 into eax. It reads the memory at the address held in ebx and stores that value, i.e. 5, into eax.
Now eax = 5.
2. add
The add instruction takes 2 arguments:
add arg1,arg2
This instruction adds the value of arg2 to arg1 and stores the result in arg1.
E.g. (1)
consider eax = 2
add eax,0x5
After this instruction is executed, eax = eax + 5, i.e. eax = 2 + 5, so eax = 7.
E.g. (2)
consider eax = 4, ebx = 10
add ebx,eax
After this instruction is executed, ebx = ebx + eax, i.e. ebx = 10 + 4, so ebx = 14.
3. sub
The sub instruction takes 2 arguments. It works like add:
sub arg1,arg2
This instruction subtracts the value of arg2 from arg1 and stores the result in arg1.
E.g. (1)
consider eax = 11
sub eax,0x5
After this instruction is executed, eax = eax - 5, i.e. eax = 11 - 5, so eax = 6.
E.g. (2)
consider eax = 4, ebx = 10
sub ebx,eax
After this instruction is executed, ebx = ebx - eax, i.e. ebx = 10 - 4, so ebx = 6.
4. push / pop
The push instruction takes 1 argument:
push arg
This instruction pushes arg onto the TOP of the stack.
E.g.
consider eax = 3
push eax
- When an argument is given to push, push first decrements esp (stack pointer) by 4 bytes and then stores the value, here 3, at the new TOP of the stack.
- Note: why is esp decremented? Because the stack addresses go from high to low.
E.g.
if the stack frame spans 0x1735 to 0x1720, then esp starts at 0x1735 and moves down towards 0x1720 as values are pushed.
The pop instruction takes a register as its argument:
pop reg
This instruction stores the value of the element on TOP of the stack into reg, then removes (pops) that element from the stack.
E.g.
consider the top element on the stack is 3
pop eax
- When an argument is given to pop, pop stores the value from the top of the stack into the register given as the argument, so eax = 3.
- Then pop increments esp (stack pointer) by 4 bytes.
5. lea
lea stands for load effective address. This instruction takes a register and an address expression as arguments:
lea reg,[addr]
This instruction computes the address and stores the address itself into the given register, without reading the memory at that address.
E.g.
lea eax,[0x1739]
The address, i.e. 0x1739, is stored into the eax register.
Now eax = 0x1739.
Control flow of Executable
- All if statements and loops in your code come down to these instructions.
- Every instruction has an instruction address.
- eip (instruction pointer) contains the address of the instruction the CPU is about to execute, and then moves on to the next instruction.
1. cmp
cmp is a compare instruction that takes two arguments. It works the same as sub, but rather than storing the result, it sets flags in the flags register. For simplicity, we describe the outcome as <0, >0, or =0.
cmp arg1,arg2
The first argument cannot be a constant, so the examples use a register.
E.g. (1)
consider eax = 1
cmp eax,3
After this instruction is executed, 1 - 3 = -2, so the flags record <0.
E.g. (2)
consider eax = 5
cmp eax,2
5 - 2 = 3, so the flags record >0.
E.g. (3)
consider eax = 5
cmp eax,5
5 - 5 = 0, so the flags record =0.
Note: the cmp (compare) instruction is usually followed by a conditional jump instruction.
2. jump
This instruction takes an address as an argument:
jmp addr
A plain jmp always jumps to the address. Jumping basically means changing the value of eip (instruction pointer) to the given address. The conditional jump instructions check the current state of the flags and jump only if their condition holds.
Types of conditional jump instruction:
- jne -> jump if not equal
- je -> jump if equal
- jg -> jump if greater than
- jl -> jump if less than
E.g. (1)
consider eax = 1
cmp eax,3
jl addr27
- When the above instructions are executed, eip reaches jl addr27, as shown in the image.
- At this point the flags are checked; they record <0.
- The condition is true because the instruction is jump if less than, so eip goes to addr27 and instruction 27 is executed.
E.g. (2)
consider eax = 5
cmp eax,2
jl addr27
- When the above instructions are executed, eip reaches jl addr27, as shown in the image.
- At this point the flags are checked; they record >0.
- The condition is false because the instruction is jump if less than, so eip moves on to the next address, addr4, and instruction 4 is executed.
- If the instruction were jg rather than jl, the condition would be true, and eip would go to addr27 and execute instruction 27.
3. call
The call instruction calls a function, whether it is user-defined or built-in. call takes a function as its argument:
call <func>
When the call instruction is executed, it pushes the return address (the address of the instruction right after the call) onto the stack and jumps to the first instruction of the function.
4. leave / return
The leave instruction is normally followed by the return (ret) instruction.
leave
- The leave instruction sets esp (stack pointer) to ebp (base pointer) and then pops the saved base pointer back into ebp.
- This means leave destroys the current stack frame.
return
- After leave, the return address is on top of the stack.
- The return instruction pops the return address from the stack and sets eip (instruction pointer) to that return address.
Thank you for reading my blog post on Binary Exploitation Basics.
Resources
- x86 Assembly Crash Course (this blog is based on this video)
- Try this for visualizing the stack and heap by running C code