Heap overflow and integer overflow
Attacks on applications are among the most common actions that hackers carry out. By taking advantage of an error in a program, an intruder can gain the access rights under which the program started. Programming bugs can leave data from the process memory open to attack. This chapter demonstrates how hackers use this type of error.
Memory segments
Every program has a specific amount of RAM at its disposal. When a program starts up, the system kernel creates a memory area for it and allocates more memory to that area as needed. One part of this memory contains the executable code of the program; another might contain its static data. This division is known as splitting memory into segments. As we have already mentioned, a program uses five segments during its operation:
Program code (text)
Initialized data (data)
Uninitialized data (bss)
Space for dynamic memory (heap)
Stack
They are located in the address space of the process in this order. The program code sits at the lowest addresses, while the stack sits at the highest addresses and grows toward lower ones.
We will now take a closer look at the following program to learn what the individual segments are for (/CD/Chapter8/Listings/test.c).
As we can see, they are very numerous, as many as 27. These are not, however, only memory segments, but also segments of a binary file. Their initial addresses and sizes are constant, and therefore they can be written to the binary file. Heap segments and stack segments are dynamic, meaning that they change their size. Their initial address depends on the system, and therefore the information on those segments is not included in the binary file.
With the objdump program we can also check that our variables are located where we expect:
We can thus determine the addresses of the static and global variables. The dynamic variables placed on the stack or heap are created while the program runs, so there is no way to find them by examining the binary file alone.
Another program segment that we mentioned is text. It contains the executable code of the program; in other words, the successive instructions for the processor. No variables are stored in it. We can also display its content using objdump:
As we notice, the data here do not mean much to a human, but are understandable to a processor.
In order to see a more legible version of this segment, we can change it into assembly-language instructions using the -d option:
Here, objdump has demonstrated that there are many functions in the program (their body in the assembly language has been replaced with ellipsis). As programmers we have written only the code of the main() function. The rest has been added by the gcc compiler and constitutes part of the text segment.
Let’s have a closer look now at the heap segment.
Heap
As we know, to allocate memory in the heap segment we use the malloc() function. This function is not provided by the kernel, however, but by the C library. The underlying function that the kernel makes available for allocating heap memory is brk(). It takes the new address of the end of the heap as its parameter. If we give it an address greater than the current end, it allocates a new memory area. If we give it an address smaller than the current end, the corresponding amount of memory is released. Let’s assume we want to allocate 16 bytes of memory on the heap. To do that, we have to find the current end of the heap and pass brk() a value greater by 16. To find the point where the heap ends, we can call the sbrk() function with 0 as its parameter. Here is a program that executes these operations (/CD/Chapter8/Listings/test2.c):
We will now test our program to see if it really does allocate 16 bytes of memory:
As we can see, after executing the brk() function, the address of the heap end changes by 16; in other words, memory has been assigned. Working out the address of the heap end each time and passing the appropriate argument to brk() is unnecessary. We can use the sbrk() function of the C library and pass it the number of bytes that we want to allocate. It will then perform these operations for us. The best solution, however, is to use the malloc() function, as in our first example. It is part of the standard C library shipped with every compiler, so programs that use it will work everywhere. On Linux, malloc() performs operations similar to those of sbrk(), but it also avoids requesting small memory areas too often, to limit memory fragmentation. Consecutive memory areas are allocated right next to each other. This carries some risk, as described next.
Buffer overflow
When it succeeds, the malloc() function returns the address of the new memory area. Subsequent calls allocate memory right next to the previous areas. If our program copies data to the first buffer without checking its size, it can cause the second to be overwritten. We will now analyze the following program (/CD/Chapter8/Listings/heap.c):
At the beginning we allocate two buffers: buf1 and buf2. The first one is located at a lower address than buf2 in the process memory. Next, we calculate the distance between buf1 and buf2 and assign the result to the “how much” variable. In this way we will know how many bytes of data we have to pass to the program so that the copy overwrites buf2. At the end of the code is the call to strcpy(), which copies the first program argument into buf1 without checking its length and is therefore quite risky. Let’s test our program:
The first byte of buf2 is now the zero byte appended by the strcpy() function, so the program reports that buf2 has no content. The content of the buffers after overwriting looks like this:
buf1: BBBBBBBB | gap: BBBBBBBB | buf2: 0AAAAAAA
All we need to do is pass a character sequence longer than 16 bytes, and the result will be visible:
After we pass 20 B characters, buf2 holds the value “BBBB”, even though nowhere in the program did we assign it. Our example program is therefore susceptible to heap overflow attacks. Now, we will see how we can put this to practical use.
An example of heap overflow
To take advantage of a heap overflow error in practice we have to have something to overwrite. Unlike the stack, which we discussed in an earlier chapter, the heap holds no ready-made pointers that we can overwrite. We can overwrite only what the program itself has placed there. A frequently used technique is to overwrite the names of files that the program uses, since they are often stored on the heap.
Let’s take a look at the program below, which prints a given number of lines from the “file.txt” file (/CD/Chapter8/Listings/heap2.c):
The program reads as many lines from the “file.txt” file as we specify in the first argument. We will now try to overwrite the file name in such a way that the program opens another file, for example /etc/passwd.
From the previous example we know that there is a gap of 8 bytes between buffers allocated by malloc(). We will, therefore, pass 24 bytes to our program to fill this, followed by the path to the file. The first fill character will be the number of lines we want to read:
For now this works only when we run the program as the root user. If such a program ran with elevated privileges on a real system, we could gain access to, for example, the encrypted system passwords:
If the password is easy, we can use a password cracker to gain full access to the system.
An example of bss overflow
Buffer overflow is also an issue in the bss segment. If we do not limit the amount of data copied into a buffer located in this segment, the excess will overwrite neighbouring memory that does not belong to that buffer. The most frequent case of bss overflow is “function pointer overflow.” Let’s have a look at the following example (/CD/Chapter8/Listings/bss.c):
#include <stdio.h>
#include <stdlib.h>
On the basis of the first argument, the program assigns an appropriate value to the function pointer. Next, it copies the function parameters into static buffers and passes them in the function call. The a and b buffers and the func function pointer are located in the bss segment. Before calling func(), the program executes strcpy(), which, as we already know, can write past the end of a buffer.
Let’s test our program.
bash-2.05b$ ./bss add 2 2
4
bash-2.05b$ ./bss multiply -32 92
-2944
bash-2.05b$
The program works perfectly for short data strings. But what happens if we pass a long character sequence as the third argument?
bash-2.05b$ ./bss multiply -32 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Segmentation fault (core dumped)
The program reports a memory protection error (a segmentation fault). The A characters were copied into the “b[SIZE]” buffer. The buffer was too small to hold such a sequence, so the copy overwrote the memory area beyond it. The func pointer lay in that overwritten memory. When func() is called, instead of jumping to the appropriate function, we jump to the address “AAAA.” We will now check this using the gdb program:
As we can see, our assumptions proved correct. The address 0x41414141 is not part of the memory assigned to our process, so when the program tried to access it, the system kernel killed the program. The bss overflow shown above gives us more opportunities than a heap overflow would. If we overwrite the function pointer, we can direct the operation of the whole program. Let’s try jumping to the divide() function instead of the add() function by overwriting the func pointer with its address. First we determine the address of the divide() function:
(gdb) print &divide
$1 = (<text variable, no debug info> *) 0x8048480 <divide>
We know that it is 0x8048480. Now, using a short Perl one-liner, we pass the arguments we have prepared to the program:
bash-2.05b$ ./bss add 8 0002`perl -e 'print "\x80\x84\x04\x08"x10'`
4
bash-2.05b$
Our second number to add is 0002<address_divide_function>; that is, after the atoi() function is called, simply 2. The atoi() function converts the character sequence into an integer and stops at the first character that is not a digit. As we can see, we managed to induce division instead of addition (8 divided by 2 gives 4), even though the first argument told the program to do something completely different. We should bear in mind that the bytes of the function address have to be entered in reverse order, starting from the least significant byte.
By overwriting the function pointer we have made the program execute an operation of our choosing, but this has not yet given us anything of real benefit. Instead of using one of the program's own functions, we can pass the binary code of our own function in the call argument and then jump to it. Our function will start the /bin/sh shell. As we know, such a representation of a function in the form of characters is called a shellcode. The following listing shows the exploit code that starts the shell using the error in our program (/CD/Chapter8/Listings/exp_bss.c):
We place our shellcode in an environment variable so that determining its address in memory will be easy. Then we start the vulnerable program with the arguments “add,” “2,” <buffer with shellcode addresses>. The shellcode addresses overwrite the func pointer, which then runs our shellcode instead of add(). Let’s check if it will work:
bash-2.05b$ gcc -o exp_bss exp_bss.c
bash-2.05b$ ./exp_bss
sh-2.05b$ exit
exit
bash-2.05b$
As can be seen, we have managed to start up the sh shell without significant problems. If the “bss” program were working with root privileges, we would obtain full access to the system resources.
In summary, serious hackers should investigate heap and bss overflow errors just as they do other errors, even though these are often difficult or impossible to take advantage of. A lot of information is stored on the heap, and overwriting it can bring us benefits. Apart from the buffers created by our program, there is also information stored, for example, by the libc library. A clever hacker will use anything, even the smallest gap, to penetrate the system.