Post-Build Executable Back Patching

23 Mar 2008
Perform Advanced Post-Build Executable Processing with the DbgHelp Library

Introduction

Patching an executable after the build process is an effective way to customize the compilation and linking process. It is especially useful in cases where the environment's tools do not offer convenient extensibility. For example, in both Dynamic TEXT Section Image Verification and Tamper Aware and Self Healing Code, we were required to copy and paste derived results (the computed hash value over the .text section) back into the source files. This resulted in another build so that the proper signatures were embedded in the executable. Each time a source file changed, we were required to perform the additional step.

This article will present a command line tool to automate the back patching process. A command line tool was chosen over a Visual Studio AddIn because a command line tool also offers the ability to be scripted. The program will use DbgHelp.dll's symbol manager to locate memory addresses of a variable in the executable which requires patching. The program will then translate the virtual address output from DbgHelp.dll into a disk offset and write the values into the target executable.

The DbgHelp Library

The DbgHelp library provides services to debuggers and other system utilities to aid in working with debug information. The library is now considered part of the operating system. Other libraries, such as ImageHlp.dll, route calls through it. Unlike other system files, however, DbgHelp is not updated during upgrades and patches (though the library is under the purview of Windows File Protection).

The symbol file contains the same debugging information that an executable file would contain, except the information is stored in a debug (.dbg) file or a program database (.pdb) file rather than in the executable. Neither Matt Pietrek nor John Robbins has much to say with respect to the PDB format. However, we could use Undocumented Windows 2000 to write our own symbol parser, since Schreiber reverse engineered the format.

The library is maintained by the Windows Operating System team. To obtain the latest version of the library, we need to download the latest Debugging Tools for Windows (http://www.microsoft.com/whdc/devtools/debugging/default.mspx), which includes WinDbg, kd, and cdb, along with the latest header and lib files. Usenet questions regarding the library's use should be directed to microsoft.public.windbg. Each time the PDB format is modified (for example, between Visual Studio releases), the latest version of the library will properly operate on the symbol files. As of this writing, the latest version is 6.8, dated October 2007.

DLL hell is alive and well with the library, because the library is not updated by update services. For example, on a fully patched Windows XP Pro workstation, Windows File Protection is guarding version 5.1.2600 of the file. If we try to overwrite the file, WFP will promptly restore version 5.1 from cache (dated August 2004). As a result, tools such as IDA Pro, Debugging Tools for Windows, and Visual Studio carry around their own version of the library.

Image 1

Using a down level version of DbgHelp (for example, an outdated version in \System32) seems to most often result in Win32 error 87: "The parameter is incorrect", with an occasional error 126: "The specified module could not be found". Another issue related to versioning is "The procedure entry point SymInitializeW could not be located in the dynamic link library dbghelp.dll." In this case, we linked against the updated version of the library (from Debugging Tools for Windows), but used a down level version of the DLL.

When compiling and linking, we specify the location of the updated header files and LIB using the project's property page under Additional Paths. The location is usually C:\Program Files\Debugging Tools for Windows\SDK, \inc and \lib. Also note that we need to install the Windows SDK for Windows Server 2008 so that we have access to the latest enumerations (such as BasicType) referenced in DbgHelp.h. Visual Studio 2005 users should read Section 4.3 of the SDK's Release Notes. Finally, if we desire only the enumerations (without the SDK), we can pull them from the Debug Interface Access SDK, Enumerations and Structures, which does not appear to be a Microsoft download.

This is the version of DbgHelp we will eventually carry around (as IDA Pro and Visual Studio do), not the version the operating system would otherwise load (presumably from \System32) if the library were missing from the project. Because of the versioning issues, all samples have the library included in the archive in both the \Debug and \Release directories as shown in Figure 2.

Image 2
Figure 2: Sample Directory Structure with Library

February 2005's issue of Under the Hood featured an article by Matt Pietrek entitled, Visual Studio 2005 (Whidbey) and Programs Using DBGHELP. According to Pietrek, if using both DbgHelp and ImageHlp libraries, we should include the DbgHelp library first to ensure the desired version of the library is loaded.

The Sample Programs

There are three sample programs with this article. The first program exercises the DbgHelp library. Sample two has two projects - a target for patching (appropriately named Target.exe) and the program which performs the patch (BackPatch.exe). The final sample has one project - Target.exe, which uses the BackPatch.exe program from sample two in a post-build step.

Sample 1

The first sample uses hard coded parameters to demonstrate the symbol library. Its sole purpose is to familiarize ourselves with the functions we will need from the library, which are listed below.

  • SymInitialize
  • SymSetOptions
  • SymSetSearchPath
  • SymLoadModule
  • SymGetModuleInfo
  • SymFromName
  • SymUnloadModule
  • SymCleanup

The code below shows us the steps we take to determine the virtual address of a symbol named variable in filename Target.exe. Often we will encounter it as Target!variable. Below, variable is declared global and HANDLEID is defined as 1. The handle provides a context for the library in case the debugger is managing multiple clients. We could use GetCurrentProcess, but the constant is more expedient.

int variable = 0;

string symbol = "variable";
string filename = "Target.exe";
string path = "C:\\...\\BackPatch\\debug";

SymInitialize( HANDLEID, NULL, FALSE );

We pass NULL to SymInitialize because we will set the search path later. FALSE is passed for fInvadeProcess because we do not need the library to enumerate all loaded modules, which would effectively call the SymLoadModule function for each module.

Once we initialize the library, we choose to undecorate names and load our desired search path. If we were interested in source line numbers, we would OR in SYMOPT_LOAD_LINES. SYMOPT_DEBUG would send additional information to the output window through OutputDebugString. See Setting Symbol Options in MSDN (http://msdn2.microsoft.com/en-us/library/cc266462.aspx) for a complete list of available options.

SymSetOptions( SymGetOptions() | SYMOPT_UNDNAME );

Next we specify a directory search path.

SymSetSearchPath( HANDLEID, path );

If we specify a directory which does not have enough granularity, we may receive incorrect results since DbgHelp will search for and possibly find incorrect symbols for the release build of Target.exe in the debug directory (and vice versa). In Figure 3 below, a path of "...\ExeFunctionMap Files\ExeFunctionMap\" was passed to SymSetSearchPath. We run the release build, but DbgHelp loads the first symbols it finds for the executable, which are debug symbols.

Figure 3
Figure 3: DbgHelp Search Path Results

Next we call on the library to load the symbol table for the module, followed by a call to SymGetModuleInfo, which provides information on the loaded module and its symbol table. The image does not need to be in memory for the call to SymLoadModule. Note that SymLoadModule has been superseded by SymLoadModule64, which in turn has been superseded by SymLoadModuleEx.

DWORD64 dwBaseAddress = SymLoadModule64( HANDLEID, 0, filename, NULL, 0, 0 );

IMAGEHLP_MODULE64 im = { sizeof(im) };
SymGetModuleInfo64( HANDLEID, dwBaseAddress, &im );

We then build a structure so that we can query the library for information on the symbol variable. The Name member of the SYMBOL_INFO structure is variable in length, so we must supply a buffer large enough to hold both the structure and the name stored at its end. The buffer size calculation, taken from MSDN, is omitted here for clarity; see the source code for details. Finally, the query is performed by the call to SymFromName.

TCHAR szSymbolName[MAX_SYM_NAME];
ULONG64 buffer[...];
PSYMBOL_INFO pSymbolInfo = (PSYMBOL_INFO)buffer;

StringCchCopy( szSymbolName, MAX_SYM_NAME, symbol.c_str() );
pSymbolInfo->SizeOfStruct = sizeof(SYMBOL_INFO);
pSymbolInfo->MaxNameLen = MAX_SYM_NAME;

SymFromName( HANDLEID, szSymbolName, pSymbolInfo );

cout << "    Symbol Name: " << pSymbolInfo->Name << endl;
cout << "Virtual Address: 0x" << hex << pSymbolInfo->Address << endl;
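The sizing rule for a structure that ends in a variable-length name can be sketched in portable C++. The record type below is a hypothetical stand-in for SYMBOL_INFO (it is not the real layout), and kMaxName and PrepareRecord are names assumed for illustration; only the round-up arithmetic is the point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a record whose last member is the first
// character of a variable-length name (NOT the real SYMBOL_INFO layout).
struct SymbolRecord {
    std::uint32_t SizeOfStruct;
    std::uint32_t MaxNameLen;
    std::uint64_t Address;
    char          Name[1];   // the name continues past the end of the struct
};

constexpr std::size_t kMaxName = 2000;   // assumed limit, for illustration only

// Number of 64-bit words needed to hold the record plus kMaxName characters,
// rounded up so the buffer is 8-byte aligned and never too short.
constexpr std::size_t kWords =
    (sizeof(SymbolRecord) + kMaxName * sizeof(char) + sizeof(std::uint64_t) - 1)
    / sizeof(std::uint64_t);

// Zero a caller-supplied buffer and fill in the two size fields that a
// lookup routine would read before writing the name.
inline SymbolRecord* PrepareRecord(std::uint64_t (&buffer)[kWords]) {
    std::memset(buffer, 0, sizeof(buffer));
    SymbolRecord* rec = reinterpret_cast<SymbolRecord*>(buffer);
    rec->SizeOfStruct = static_cast<std::uint32_t>(sizeof(SymbolRecord));
    rec->MaxNameLen   = static_cast<std::uint32_t>(kMaxName);
    return rec;
}
```

Declaring the buffer as an array of 64-bit words rather than bytes is what gives the structure its alignment; the division rounds up so the name never runs past the end.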

Finally, we unload the module and shut down the library with SymUnloadModule and SymCleanup.

SymUnloadModule64( HANDLEID, dwBaseAddress );

SymCleanup( HANDLEID );

Output of the first sample is shown in Figure 4. Note that even though we moved the executable (and DbgHelp.dll) into the root of C:, the symbol engine correctly locates the PDB file which is displayed under loaded modules as C:\...\BackPatch\BackPatch.pdb.

Image 4
Figure 4: Sample 1 Output

Sample 2

Our second sample is the most interesting. It consists of two projects: a driver and a patcher. Two projects were used because it is easier to distribute both collectively.

The driver program, named Target.exe, displays the value of an integer declared in its source files. The listing of Target.cpp is shown below, with its output shown in Figure 5.

INT variable = 0xFFBB8844;

int main(int argc, _TCHAR* argv[])
{
    cout << "variable: 0x";
    cout << HEX(8) << variable << endl;

    cout << " Address: 0x";
    cout << HEX(8) << (INT*) &variable << endl;

    return 0;
}
Image 11
Figure 5: Target.exe Output

BackPatch.cpp contains the symbol engine code introduced in sample one. Since we want BackPatch.exe to stand alone, we added Chris Losinger's CCmdLine (http://www.codeproject.com/kb/recipes/ccmdline.aspx) for command line parsing, after adding some upgrades to support Unicode. The supported switches in BackPatch are as follows:

Switch  Meaning
-s      symbol name
-f      file which will be back patched
-p      path which will be passed to DbgHelp
-n      new value
-v      verbose output
-t      test only
For a project rooted at C:\BackPatch, patching an integer symbol named 'variable' in Target.exe with a new value of 0xF0F00F0F would use the following command line: BackPatch.exe -s variable -f Target.exe -p "C:\BackPatch" -n 0xF0F00F0F.
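To make the switch handling concrete, here is a minimal sketch of a parser for the same switches in standard C++. It is not CCmdLine and not the code in BackPatch.cpp: PatchOptions and ParseOptions are hypothetical names, and the rule that -n may be omitted when -t is given is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Options mirroring the switch table. The struct and function names are
// hypothetical and are not part of CCmdLine or BackPatch.
struct PatchOptions {
    std::string   symbol;           // -s
    std::string   file;             // -f
    std::string   path;             // -p
    unsigned long value    = 0;     // -n (decimal, or hex with a 0x prefix)
    bool          hasValue = false;
    bool          verbose  = false; // -v
    bool          testOnly = false; // -t
};

// Returns true when the arguments are well formed and -s and -f are present.
// Assumption: -n is required unless -t (test only) was given.
inline bool ParseOptions(const std::vector<std::string>& args, PatchOptions& out) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& a = args[i];
        if (a == "-v") { out.verbose  = true; continue; }
        if (a == "-t") { out.testOnly = true; continue; }
        if (a != "-s" && a != "-f" && a != "-p" && a != "-n") return false;
        if (i + 1 >= args.size()) return false;   // switch without a value
        const std::string& v = args[++i];
        if      (a == "-s") out.symbol = v;
        else if (a == "-f") out.file   = v;
        else if (a == "-p") out.path   = v;
        else {
            char* end = nullptr;
            out.value = std::strtoul(v.c_str(), &end, 0);  // base 0 accepts 0x...
            if (end == v.c_str() || *end != '\0') return false;
            out.hasValue = true;
        }
    }
    return !out.symbol.empty() && !out.file.empty() &&
           (out.hasValue || out.testOnly);
}
```

Passing base 0 to strtoul lets the same switch accept either decimal or 0x-prefixed hexadecimal values, which matches how the new value is written in the command line above.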

We are going to use the Project's Property Page to pass arguments to the program using the provided macros. Visual Studio places outputs in either \Debug or \Release, so we will use the $(ProjDir) macro. See Figure 6.

Image 13
Figure 6: Command Line Parameters

$(ProjDir) is the root directory (where our solution resides) with either \Debug or \Release appended. We wrap the macro in quotes because we may encounter long file names. Figure 7 shows the results of running sample two in a solution rooted at C:\Local Shared\BackPatch\BackPatch Sample Files\BackPatch 2\.

Image 14
Figure 7: Sample 2 Output

Once DbgHelp determines the virtual address of interest (0x419004 above), we move to the disk file. DbgHelp has yielded DbgHlpTargetAddress (the virtual address) and DbgHlpBaseAddress. We then normalize the virtual address to obtain a relative virtual address:

ULONG DbgHlpVirtAddr = DbgHlpTargetAddress - DbgHlpBaseAddress;

Our next step is to map a view of the disk file into memory for Read and Write access. We have seen this code many times before:

/////////////////////////////////////////////////////////////
hFile = CreateFile( filename, GENERIC_READ | GENERIC_WRITE,
    FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);

if ( hFile == INVALID_HANDLE_VALUE ) { ... }

/////////////////////////////////////////////////////////////
hFileMapping = CreateFileMapping( hFile, NULL,
    PAGE_READWRITE , 0, 0, NULL );

if ( NULL == hFileMapping ) { ... }

/////////////////////////////////////////////////////////////
pDiskBaseAddress = MapViewOfFile( hFileMapping,
    FILE_MAP_WRITE | FILE_MAP_READ, 0, 0, 0 );

if ( NULL == pDiskBaseAddress ) { ... }

Next we parse the DOS and NT Headers to find the Section Headers. Once we obtain the first header, we loop over the headers searching for the one which contains our derived DbgHlpVirtAddr. We already know this will be located in the .data or .rdata section:

/////////////////////////////////////////////////////////////
PIMAGE_SECTION_HEADER pSectionHeader =
    IMAGE_FIRST_SECTION( pNTHeader );

UINT nSectionCount = pNTHeader->FileHeader.NumberOfSections;

// Advance pSectionHeader with the counter so every header is tested
for( UINT i = 0; i < nSectionCount; i++, pSectionHeader++ )
{
    // Find the Section...
    if( pSectionHeader->VirtualAddress <= DbgHlpVirtAddr &&
        DbgHlpVirtAddr < pSectionHeader->VirtualAddress +
                         pSectionHeader->Misc.VirtualSize )
    {
        // We found the Section Header
        ...
        break;  // leave pSectionHeader pointing at the match
    }
}

Using the [relative] VirtualAddress above tells us where to expect the variable to reside once the executable has been mapped into memory. We are concerned with where the allocation exists in the disk file. For that, we use the PointerToRawData member. However, PointerToRawData alone is not enough: it only tells us where the section starts on disk. We still need an offset into the section. For the offset we go back to the expected in-memory layout.

ULONG dwOffset = DbgHlpVirtAddr - pSectionHeader->VirtualAddress;
ULONG dwDiskLocation = pSectionHeader->PointerToRawData + dwOffset;
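The section search and the offset arithmetic can be exercised without Windows headers. The sketch below uses a hypothetical SectionInfo struct carrying only the IMAGE_SECTION_HEADER fields involved, and a hypothetical RvaToFileOffset helper. It also adds one check the listing above does not make: an RVA that falls past SizeOfRawData has no bytes on disk and is rejected.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable stand-in holding only the IMAGE_SECTION_HEADER fields used here.
struct SectionInfo {
    std::uint32_t VirtualAddress;    // RVA of the section when mapped
    std::uint32_t VirtualSize;       // size of the section in memory
    std::uint32_t PointerToRawData;  // file offset of the section's data
    std::uint32_t SizeOfRawData;     // bytes of the section present on disk
};

// Translate an RVA to a file offset. Returns false when no section contains
// the RVA, or when the RVA lies in the part of a section that has no bytes
// on disk (for example, zero-filled data past SizeOfRawData).
inline bool RvaToFileOffset(const std::vector<SectionInfo>& sections,
                            std::uint32_t rva, std::uint32_t& fileOffset) {
    for (const SectionInfo& s : sections) {
        if (rva >= s.VirtualAddress && rva - s.VirtualAddress < s.VirtualSize) {
            const std::uint32_t offset = rva - s.VirtualAddress;
            if (offset >= s.SizeOfRawData) return false;
            fileOffset = s.PointerToRawData + offset;
            return true;
        }
    }
    return false;
}
```

As an illustration with invented numbers: if the image base were 0x400000, the virtual address 0x419004 would normalize to an RVA of 0x19004. With a section starting at RVA 0x19000 whose raw data begins at file offset 0x5400, the helper would report 0x5404.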

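Finally, because dwDiskLocation is an offset into the file rather than a pointer, we add it to the base address of the mapped view, cast the result to an INT pointer, and write the new value (shown here as a literal zero):

*((INT*)((LPBYTE)pDiskBaseAddress + dwDiskLocation)) = 0x00000000;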
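For readers who want to try the write step outside of Windows, the same patch can be sketched with standard file streams instead of a mapped view. This swaps in a different technique from the article's MapViewOfFile approach; PatchUint32 is a hypothetical helper, and the bytes are written in little-endian order on the assumption that the target is an x86 image.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Overwrite four bytes at fileOffset with value in little-endian order
// (the byte order used by x86 PE images). Returns false on any I/O failure.
inline bool PatchUint32(const std::string& path, std::uint32_t fileOffset,
                        std::uint32_t value) {
    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) return false;

    // Refuse to write past the end of the file, which would grow it.
    f.seekg(0, std::ios::end);
    const std::streamoff size = f.tellg();
    if (size < 0 || static_cast<std::streamoff>(fileOffset) + 4 > size)
        return false;

    const char bytes[4] = {
        static_cast<char>(value & 0xFF),
        static_cast<char>((value >> 8) & 0xFF),
        static_cast<char>((value >> 16) & 0xFF),
        static_cast<char>((value >> 24) & 0xFF)
    };
    f.seekp(static_cast<std::streamoff>(fileOffset), std::ios::beg);
    f.write(bytes, sizeof(bytes));
    f.flush();
    return static_cast<bool>(f);
}
```

The bounds check matters for a patcher: a bad offset should fail loudly rather than silently extend the executable.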

Figure 8 shows the results of running BackPatch.exe in sample two, providing a new value of 0xFFAAFF.

Image 9
Figure 8: Result of Sample Two

In Figure 9, we examine the disk file of Target.exe after running BackPatch.exe. We do this so we can verify the value in the disk file. Note that from Figure 8, we calculate a disk location of 0x1C18. In Figure 9, we see the value 0xFFAAFF has been written.

Image 10
Figure 9: Hex Dump of Program

In Figure 10, we alternately run Target/BackPatch/Target. Target.exe's variable starts with a value of 0xF0F00F0F. After the patch, the value is 0xAA0000AA.

Image 11
Figure 10: Print/Patch/Print

Sample 3

Sample 3 examines the effects of back patching an executable after the build process. We start by dropping BackPatch.exe and DbgHelp.dll into the project's output directory (either \Debug or \Release). Next, we add a Post-Build Event which calls our command line tool (see Figure 11). The command line we enter is "$(OutDir)\BackPatch.exe" -s variable -f $(TargetFileName) -p "$(OutDir)" -n 0x11111111 -v.

Image 20
Figure 11: Post-Build Event

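After compiling and linking, our tool is invoked, as shown in Figure 12. For regular use inside Visual Studio, we would drop the -v switch and instead provide a -q switch to suppress output (the -q switch is not yet implemented and is listed with the exercises below).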

Image 20
Figure 12: Tool Invocation

Running sample three provides the expected result: 0x11111111 is displayed. Next, we examine the Checksum of the file using a tool such as PEChecksum presented in An Analysis of the Windows PE Checksum Algorithm. In Figure 13, we see the checksum is incorrect.

Image 24
Figure 13: PE Checksum

In addition, file times will have been altered. This does not pose a problem in the Visual Studio environment (for example, the changed file time does not trigger additional compilations). If we were patching a CLR assembly utilizing Strong Names, we would want to run the tool before sn.exe is invoked by the linker. See the related Strong Name Assemblies (Assembly Signing) and AssemblyDelaySignAttribute Class. As exercises left to the reader, the tool can be enhanced with the following:

  • Add a quiet (-q) switch
  • Save and restore original filetimes
  • Recalculate the PE checksum

Recall that ImageHlp routes calls into DbgHelp, so if recalculating the checksum using ImageHlp's CheckSumMappedFile, we must include the DbgHelp library before the ImageHlp library. An alternative is to calculate the checksum ourselves with an equivalent function. Such a function was presented in Grafting Compiled Code: Unlimited Code Reuse. In the latter article, we grafted the compiled code of CheckSumMappedFile into our own executable.
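As a rough idea of what an equivalent function looks like, the sketch below follows the commonly described form of the PE checksum: sum the file as little-endian 16-bit words with end-around carry, treat the CheckSum field itself as zero, and add the file length. It is an illustration rather than a verified replacement for CheckSumMappedFile. ComputePeChecksum is a hypothetical name, and the caller is assumed to have already located the CheckSum field from the PE headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum the image as little-endian 16-bit words with end-around carry,
// skipping the 4-byte CheckSum field, then add the file length.
// checksumOffset is the file offset of the CheckSum field (assumed to be
// even and supplied by the caller). An odd trailing byte is padded with zero.
inline std::uint32_t ComputePeChecksum(const std::vector<std::uint8_t>& image,
                                       std::size_t checksumOffset) {
    assert(checksumOffset % 2 == 0 && checksumOffset + 4 <= image.size());
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < image.size(); i += 2) {
        if (i == checksumOffset || i == checksumOffset + 2) continue;
        std::uint32_t word = image[i];
        if (i + 1 < image.size())
            word |= static_cast<std::uint32_t>(image[i + 1]) << 8;
        sum += word;
        sum = (sum & 0xFFFF) + (sum >> 16);   // fold the carry back in
    }
    sum = (sum & 0xFFFF) + (sum >> 16);       // final fold to 16 bits
    return sum + static_cast<std::uint32_t>(image.size());
}
```

Before relying on a routine like this, its result should be compared against CheckSumMappedFile on a few real executables.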

Downloads

  • Download Program - PEChecksum - 108 kB (http://www.codeproject.com/KB/cpp/PEChecksum/PEChecksum.zip)

Checksums

  • BackPatch1.zip
    • MD5: 2AF82B1C938ECEE614950491B33393DC
    • SHA-1: 2BDEAC1AD7EF3A63576F9CA1A7F12C732CA0D33B
  • BackPatch2.zip
    • MD5: 059B9B3578CCA165FCE4ACB2099F0364
    • SHA-1: 18ACF5B093817267FED962EB03DFA5D2DDA14B57
  • BackPatch3.zip
    • MD5: 114CFBC047B4EC6B5B0131B5B44C6E90
    • SHA-1: D4E12F5001895075C30F17FBF45D6B789AAE9FEF

Revisions

  • 03.23.2008 Added DIA SDK Information
  • 03.16.2008 Initial Release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)