Introduction
Patching an executable after the build process is an effective way to
customize the compilation and linking process. It is especially useful in
cases where the environment's tools do not offer convenient extensibility.
For example, in both Dynamic TEXT Section Image Verification and Tamper
Aware and Self Healing Code, we had to copy and paste derived results (the
computed hash value over the .text section) back into the source files and
then build again so that the proper signatures were embedded in the
executable. Each time a source file changed, we had to repeat the additional
step.
This article will present a command line tool to automate the back
patching process. A command line tool was chosen over a Visual Studio AddIn
because a command line tool also offers the ability to be scripted. The
program will use DbgHelp.dll's symbol manager to locate memory addresses of a
variable in the executable which requires patching. The program will then
translate the virtual address output from DbgHelp.dll into a disk offset and
write the values into the target executable.
The DbgHelp Library
The DbgHelp library provides services to debuggers and other system
utilities to aid in working with debug information. The library is now
considered part of the Operating System. Other libraries, such as
ImageHlp.dll, route calls through the library. Unlike other operating system
files, DbgHelp is not updated during upgrades and patches (though the
library is under the purview of Windows File Protection).
The symbol file contains the same debugging information that an executable
file would contain, except the information is stored in a debug (.dbg) file
or a program database (.pdb) rather than the executable file. Neither Matt
Pietrek nor John Robbins has much to say with respect to the PDB format.
However, we could use Undocumented Windows 2000 to write our own symbol
parser, since Schreiber reverse engineered the format.
The library is maintained by the Windows Operating System team. To obtain
the latest version of the library, we download the latest Debugging Tools
for Windows (http://www.microsoft.com/whdc/devtools/debugging/default.mspx),
which includes WinDbg, kd, and cdb. Debugging Tools for Windows includes the
latest header and lib files needed. Usenet questions regarding the library's
use should be directed to microsoft.public.windbg. Each time the PDB format
is modified (for example, between Visual Studio releases), the latest
version of the library will properly operate on the symbol files. As of this
writing,
the latest version is 6.8, dated October 2007.
DLL hell is alive and well with the library, because the library is not
updated by update services. For example, on a fully patched Windows XP Pro
workstation, Windows File Protection is guarding version 5.1.2600 of the
file. If we try to overwrite the file, WFP will promptly restore version 5.1
from cache (dated August 2004). As a result, tools such as IDA Pro,
Debugging Tools for Windows, and Visual Studio each carry around their own
version of the library.
Using a down-level version of DbgHelp (for example, an outdated version in
\System32) seems most often to result in Win32 error 87: "The parameter is
incorrect", with an occasional error 126: "The specified module could not be
found". Another issue related to versioning is "The procedure entry point
SymInitializeW could not be located in the dynamic link library dbghelp.dll."
In this case, we linked against the updated version of the library (from
Debugging Tools for Windows), but loaded a down-level version of the DLL at
run time.
When compiling and linking, we specify the location of the updated header
files and LIB using the project's property page under Additional Paths. The
files are usually found in the \inc and \lib subdirectories of
C:\Program Files\Debugging Tools for Windows\SDK. Also note that we need to
install the Windows SDK for Windows Server 2008 so that we have access to the
latest enumerations (such as BasicType) referenced in DbgHelp.h. Visual
Studio 2005 users should read Section 4.3 of the SDK's Release Notes.
Finally, if we desire only the enumerations (without the SDK), we can pull
them from the Debug Interface Access SDK, Enumerations and Structures, which
does not appear to be a Microsoft download.
This version of DbgHelp is the version we will eventually carry around (as
IDA Pro and Visual Studio do), and not the version the Operating System will
attempt to load if it is missing from the project (presumably the one in
\System32).
Because of the versioning issues, all samples have the library included in
the archive in both the \Debug and \Release directories as shown in Figure
2.
Figure 2: Sample Directory Structure with Library
February 2005's issue of Under the Hood featured an article by Matt
Pietrek entitled, Visual
Studio 2005 (Whidbey) and Programs Using DBGHELP. According to
Pietrek, if using both DbgHelp and ImageHlp libraries, we should include the
DbgHelp library first to ensure the desired version of the library is
loaded.
The Sample Programs
There are three sample programs with this article. The first program
exercises the DbgHelp library. Sample two has two projects - a target for
patching (appropriately named Target.exe) and the program which performs the
patch (BackPatch.exe). The final sample has one project - Target.exe, which
uses the BackPatch.exe program from sample two in a post-build step.
Sample 1
The first sample uses hard coded parameters to demonstrate the symbol
library. Its sole purpose is to familiarize ourselves with the functions we
will need from the library, which are listed below.
- SymInitialize
- SymSetOptions
- SymSetSearchPath
- SymLoadModule
- SymGetModuleInfo
- SymFromName
- SymUnloadModule
- SymCleanup
The code below shows the steps we take to determine the virtual address of a
symbol named variable in the file Target.exe. Often we will encounter it
written as Target!variable. Below, variable is declared global and HANDLEID
is defined as 1. The handle provides a context for the library in case the
debugger is managing multiple clients. We could use GetCurrentProcess, but
the constant is more expedient.
int variable = 0;
string symbol = "variable";
string filename = "Target.exe";
string path = "C:\\...\\BackPatch\\debug";
SymInitialize( HANDLEID, NULL, FALSE );
We pass NULL to SymInitialize because we will set the search path
later. FALSE is passed for fInvadeProcess: we do not need the
library to enumerate all loaded modules, effectively calling the
SymLoadModule function for each module.
Once we initialize the library, we choose to undecorate names and load our
desired search path. If we were interested in source line numbers, we would
OR in SYMOPT_LOAD_LINES. SYMOPT_DEBUG would send additional information to
the output window through OutputDebugString. See Setting Symbol Options in
MSDN (http://msdn2.microsoft.com/en-us/library/cc266462.aspx) for a complete
list of available options.
SymSetOptions( SymGetOptions() | SYMOPT_UNDNAME );
Next we specify a directory search path.
SymSetSearchPath( HANDLEID, path.c_str() );
If we specify a directory which does not have enough granularity, we may
receive incorrect results since DbgHelp will search for and possibly find
incorrect symbols for the release build of Target.exe in the debug directory
(and vice versa). In Figure 3 below, a path of "...\ExeFunctionMap
Files\ExeFunctionMap\" was passed to SymSetSearchPath. We run the
release build, but DbgHelp loads the first symbols it finds for the
executable, which are debug symbols.
Figure 3: DbgHelp Search Path Results
Next we call on the library to load the symbol table for the module,
followed by a call to SymGetModuleInfo, which provides information on the
loaded module and its symbol table. An image does not need to be in memory
for the call to SymLoadModule. Note that SymLoadModule has been superseded by
SymLoadModule64, which in turn has been superseded by SymLoadModuleEx.
DWORD64 dwBaseAddress = SymLoadModule64( HANDLEID, 0, filename.c_str(), NULL, 0, 0 );
IMAGEHLP_MODULE64 im = { sizeof(im) };
SymGetModuleInfo64( HANDLEID, dwBaseAddress, &im );
We then build a structure so that we can query the library for information
on the symbol variable. The Name member of the SYMBOL_INFO structure is
variable in length, so we must supply a buffer that is large enough to hold
the name stored at the end of the structure. The sizing of the buffer is
taken from MSDN; the details are omitted here for clarity and can be found
in the source code. Finally, the query is accomplished in the call to
SymFromName.
TCHAR szSymbolName[MAX_SYM_NAME];
ULONG64 buffer[...];
PSYMBOL_INFO pSymbolInfo = (PSYMBOL_INFO)buffer;
StringCchCopy( szSymbolName, MAX_SYM_NAME, symbol.c_str() );
// Describe the buffer to the library before querying
pSymbolInfo->SizeOfStruct = sizeof(SYMBOL_INFO);
pSymbolInfo->MaxNameLen = MAX_SYM_NAME;
SymFromName( HANDLEID, szSymbolName, pSymbolInfo );
cout << " Symbol Name: " << pSymbolInfo->Name << endl;
cout << "Virtual Address: 0x" << hex << pSymbolInfo->Address << endl;
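The elided buffer size follows the usual pattern for a structure with a trailing variable-length name: reserve room for the fixed part plus the longest name, rounded up to whole 64-bit elements. The sketch below illustrates only that arithmetic. It uses a hypothetical stand-in struct (SymbolInfoLike) rather than the real SYMBOL_INFO so that it compiles without DbgHelp.h, and kMaxSymName is assumed to mirror MAX_SYM_NAME.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical stand-in for SYMBOL_INFO: fixed fields followed by a
// one-character placeholder where the variable-length name begins.
struct SymbolInfoLike {
    std::uint32_t SizeOfStruct;
    std::uint32_t MaxNameLen;
    std::uint64_t Address;
    char          Name[1];
};

// Assumed to match MAX_SYM_NAME from DbgHelp.h.
const std::size_t kMaxSymName = 2000;

// Number of 64-bit elements needed to hold the struct plus the longest
// name, rounded up so that no partial element is lost.
const std::size_t kBufferElems =
    (sizeof(SymbolInfoLike) + kMaxSymName * sizeof(char) +
     sizeof(std::uint64_t) - 1) / sizeof(std::uint64_t);

// Prepares a caller-supplied buffer of kBufferElems elements in the same
// way the listing above prepares its SYMBOL_INFO buffer.
inline SymbolInfoLike* prepareSymbolBuffer(std::uint64_t* buffer) {
    SymbolInfoLike* info = new (buffer) SymbolInfoLike();
    info->SizeOfStruct = static_cast<std::uint32_t>(sizeof(SymbolInfoLike));
    info->MaxNameLen   = static_cast<std::uint32_t>(kMaxSymName);
    return info;
}
```

Backing the structure with an array of 64-bit integers, as the listing does, also keeps the buffer suitably aligned for the 64-bit Address member.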
Finally, we unload the module and shut down the library with
SymUnloadModule and SymCleanup.
SymUnloadModule64( HANDLEID, dwBaseAddress );
SymCleanup( HANDLEID );
Output of the first sample is shown in Figure 4. Note that even though we
moved the executable (and DbgHelp.dll) into the root of C:, the symbol engine
correctly locates the PDB file which is displayed under loaded modules as
C:\...\BackPatch\BackPatch.pdb.
Figure 4: Sample 1 Output
Sample 2
Our second sample is the most interesting. It consists of two projects - a
driver and a patcher. Two projects were used because it is easier to
distribute both collectively.
The driver program, named Target.exe, displays the value of an integer
declared in its source files. The listing of Target.cpp is shown below, with
its output shown in Figure 5.
INT variable = 0xFFBB8844;
int main(int argc, _TCHAR* argv[])
{
    cout << "variable: 0x";
    cout << HEX(8) << variable << endl;
    cout << " Address: 0x";
    cout << HEX(8) << (INT*) &variable << endl;
    return 0;
}
Figure 5: Target.exe Output
BackPatch.cpp contains the symbol engine code introduced in sample one.
Since we want BackPatch.exe to stand alone, we added Chris Losinger's
CCmdLine (http://www.codeproject.com/KB/recipes/ccmdline.aspx) for command
line parsing, after upgrading it to support Unicode. The supported switches
in BackPatch are as follows:
Switch | Meaning |
-s | symbol name |
-f | file which will be back patched |
-p | path which will be passed to DbgHelp |
-n | new value |
-v | verbose output |
-t | test only |
To patch an integer symbol named 'variable' in Target.exe with a new value
of 0xF0F00F0F, for a project rooted at C:\BackPatch, the command line would
be formed as follows: BackPatch.exe -s variable -f Target.exe -p
"C:\BackPatch" -n 0xF0F00F0F.
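The article relies on CCmdLine for parsing. Purely as an illustration of how the switches in the table map onto settings, here is a standard-library sketch. The Options struct and the parseArgs function are hypothetical names and are not part of BackPatch or CCmdLine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical container for the switches listed in the table above.
struct Options {
    std::string   symbol;    // -s
    std::string   file;      // -f
    std::string   path;      // -p
    unsigned long newValue;  // -n
    bool          hasValue;  // true once -n has been parsed
    bool          verbose;   // -v
    bool          testOnly;  // -t
    Options() : newValue(0), hasValue(false), verbose(false), testOnly(false) {}
};

// Returns false on an unknown switch, a missing argument, or a value
// that is not a valid number.
bool parseArgs(const std::vector<std::string>& args, Options& out) {
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& a = args[i];
        if (a == "-v") { out.verbose = true; continue; }
        if (a == "-t") { out.testOnly = true; continue; }
        if (a != "-s" && a != "-f" && a != "-p" && a != "-n") return false;
        if (i + 1 >= args.size()) return false;  // switch needs an argument
        const std::string& value = args[++i];
        if (a == "-s") out.symbol = value;
        else if (a == "-f") out.file = value;
        else if (a == "-p") out.path = value;
        else {
            // Base 0 lets strtoul accept both 0x-prefixed hex and decimal.
            char* end = 0;
            out.newValue = std::strtoul(value.c_str(), &end, 0);
            if (end == value.c_str() || *end != '\0') return false;
            out.hasValue = true;
        }
    }
    return true;
}
```

Passing the arguments from the example command line above would leave symbol set to "variable", file set to "Target.exe", and newValue set to 0xF0F00F0F.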
We are going to use the Project's Property Page to pass arguments to the
program using the provided macros. Visual Studio places outputs in either
\Debug or \Release, so we will use the $(ProjDir) macro. See Figure
6.
Figure 6: Command Line Parameters
$(ProjDir) is the root directory (where our solution resides)
with either \Debug or \Release appended. We wrap the macro in quotes because
we may encounter long file names. Figure 7 shows the results of running
sample two in a solution rooted at C:\Local Shared\BackPatch\BackPatch Sample
Files\BackPatch 2\.
Figure 7: Sample 2 Output
Once DbgHelp determines the virtual address of interest (0x419004 above),
we move to the disk file. DbgHelp has yielded DbgHlpTargetAddress
(the virtual address) and DbgHlpBaseAddress. We then normalize the
virtual address to obtain a relative virtual address:
ULONG DbgHlpVirtAddr = DbgHlpTargetAddress - DbgHlpBaseAddress;
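As a worked instance of this subtraction: if DbgHelp reports the symbol at 0x419004 and the module base is 0x400000 (an assumed base, since the article does not state it), the relative virtual address is 0x19004. Because the result is narrowed from 64 to 32 bits, a range check is cheap insurance. The helper name toRva below is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Converts a virtual address reported by the symbol engine into a
// relative virtual address. Returns false if the address lies below the
// base or the difference does not fit in 32 bits.
bool toRva(std::uint64_t targetAddress, std::uint64_t baseAddress,
           std::uint32_t& rva) {
    if (targetAddress < baseAddress) return false;
    const std::uint64_t delta = targetAddress - baseAddress;
    if (delta > 0xFFFFFFFFull) return false;
    rva = static_cast<std::uint32_t>(delta);
    return true;
}
```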
Our next step is to map a view of the disk file into memory for Read and
Write access. We have seen this code many times before:
hFile = CreateFile( filename, GENERIC_READ | GENERIC_WRITE,
    FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);
if ( hFile == INVALID_HANDLE_VALUE ) { ... }
hFileMapping = CreateFileMapping( hFile, NULL,
    PAGE_READWRITE , 0, 0, NULL );
if ( NULL == hFileMapping ) { ... }
pDiskBaseAddress = MapViewOfFile( hFileMapping,
    FILE_MAP_WRITE | FILE_MAP_READ, 0, 0, 0 );
if ( NULL == pDiskBaseAddress ) { ... }
Next we parse the DOS and NT Headers to find the Section Headers. Once we
obtain the first header, we loop over the headers searching for the header
which contains our derived DbgHlpVirtAddr. We already know this will be
located in the .data or .rdata section:
PIMAGE_SECTION_HEADER pSectionHeader =
    IMAGE_FIRST_SECTION( pNTHeader );
UINT nSectionCount = pNTHeader->FileHeader.NumberOfSections;
// Advance to the next section header on each pass
for( UINT i = 0; i < nSectionCount; i++, pSectionHeader++ )
{
    if( pSectionHeader->VirtualAddress <= DbgHlpVirtAddr &&
        DbgHlpVirtAddr < pSectionHeader->VirtualAddress +
        pSectionHeader->Misc.VirtualSize )
    {
        ...
    }
}
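For readers without the Windows headers at hand, the same header walk can be expressed over raw bytes. The sketch below assumes the standard PE layout (e_lfanew at offset 0x3C, a 4-byte signature, a 20-byte file header, and 40-byte section headers) and a little-endian image. SectionInfo and readSections are hypothetical names, and only minimal validation is performed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical subset of the IMAGE_SECTION_HEADER fields used for patching.
struct SectionInfo {
    std::uint32_t virtualSize;
    std::uint32_t virtualAddress;
    std::uint32_t sizeOfRawData;
    std::uint32_t pointerToRawData;
};

// Read little-endian integers from the image bytes.
static std::uint32_t rd16(const std::vector<std::uint8_t>& b, std::size_t o) {
    return b[o] | (b[o + 1] << 8);
}
static std::uint32_t rd32(const std::vector<std::uint8_t>& b, std::size_t o) {
    return rd16(b, o) | (rd16(b, o + 2) << 16);
}

// Appends the section table to 'out'; returns false if the image is too
// small or lacks the MZ / PE signatures.
bool readSections(const std::vector<std::uint8_t>& img,
                  std::vector<SectionInfo>& out) {
    if (img.size() < 0x40 || img[0] != 'M' || img[1] != 'Z') return false;
    const std::size_t nt = rd32(img, 0x3C);             // e_lfanew
    if (nt + 24 > img.size()) return false;
    if (rd32(img, nt) != 0x00004550) return false;      // "PE\0\0"
    const std::size_t count   = rd16(img, nt + 6);      // NumberOfSections
    const std::size_t optSize = rd16(img, nt + 20);     // SizeOfOptionalHeader
    const std::size_t table   = nt + 24 + optSize;      // first section header
    if (table + count * 40 > img.size()) return false;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t h = table + i * 40;
        SectionInfo s;
        s.virtualSize      = rd32(img, h + 8);
        s.virtualAddress   = rd32(img, h + 12);
        s.sizeOfRawData    = rd32(img, h + 16);
        s.pointerToRawData = rd32(img, h + 20);
        out.push_back(s);
    }
    return true;
}
```

Computing the table position from SizeOfOptionalHeader mirrors what the IMAGE_FIRST_SECTION macro does, so the sketch does not depend on whether the image is 32-bit or 64-bit.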
The [relative] VirtualAddress above tells us where to expect the variable to
reside once the executable has been mapped into memory. We are concerned
with where the variable exists in the disk file. For that, we use the
PointerToRawData member. However, PointerToRawData alone is not enough - it
only tells us where the section starts on disk. We still need an offset into
the section. For the offset, we go back to the expected in-memory layout.
ULONG dwOffset = DbgHlpVirtAddr - pSectionHeader->VirtualAddress;
ULONG dwDiskLocation = pSectionHeader->PointerToRawData + dwOffset;
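The two statements above can be folded into one helper that also rejects addresses with no bytes on disk. That last check is an addition and not something the article's code is shown doing: when a section's VirtualSize exceeds its SizeOfRawData (typical for uninitialized data), the tail of the section exists only in memory. Section and rvaToFileOffset are hypothetical names, and the section values in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical subset of the section header fields needed here.
struct Section {
    std::uint32_t virtualAddress;
    std::uint32_t virtualSize;
    std::uint32_t pointerToRawData;
    std::uint32_t sizeOfRawData;
};

// Finds the section containing 'rva' and converts it to a file offset.
// Returns false if no section contains the address, or if the address is
// inside a section but beyond the bytes stored in the file.
bool rvaToFileOffset(const std::vector<Section>& sections,
                     std::uint32_t rva, std::uint32_t& fileOffset) {
    for (std::size_t i = 0; i < sections.size(); ++i) {
        const Section& s = sections[i];
        if (rva < s.virtualAddress) continue;
        const std::uint32_t offset = rva - s.virtualAddress;
        if (offset >= s.virtualSize) continue;        // not in this section
        if (offset >= s.sizeOfRawData) return false;  // not backed by the file
        fileOffset = s.pointerToRawData + offset;
        return true;
    }
    return false;
}
```

With a section mapped at 0x19000 whose raw data starts at 0x1C00, an RVA of 0x19004 resolves to file offset 0x1C04.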
Finally, given dwDiskLocation, we add it to the base address of the mapped
view, cast the result to an INT pointer, and write the new value:
*((INT*)((BYTE*)pDiskBaseAddress + dwDiskLocation)) = 0x00000000;
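Since the computed location is an offset from the start of the file, it has to be applied to the base of the mapped view before anything is written. A minimal sketch of that step over a plain byte buffer standing in for the view follows. patchInt32 is a hypothetical helper, and memcpy is used in place of the pointer cast so that alignment is not a concern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Writes 'value' at 'offset' bytes from 'base', where 'base' stands in
// for the start of the mapped view and 'size' for the file length.
// Returns false if the write would run past the end of the view.
bool patchInt32(unsigned char* base, std::size_t size,
                std::size_t offset, std::int32_t value) {
    if (offset > size || size - offset < sizeof(value)) return false;
    std::memcpy(base + offset, &value, sizeof(value));
    return true;
}
```

The bounds check matters for a tool like this one, because a stale PDB could yield an address that no longer falls inside the file being patched.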
Figure 8 shows the results of running BackPatch.exe in sample two,
providing a new value of 0xFFAAFF.
Figure 8: Result of Sample Two
In Figure 9, we examine the disk file of Target.exe after running
BackPatch.exe. We do this so we can verify the value in the disk file. Note
that from Figure 8, we calculate a disk location of 0x1C18. In Figure 9, we
see the value 0xFFAAFF has been written.
Figure 9: Hex Dump of Program
In Figure 10, we alternately run Target/BackPatch/Target. Target.exe's
variable starts with a value of 0xF0F00F0F. After the patch, the
value is 0xAA0000AA.
Figure 10: Print/Patch/Print
Sample 3
Sample 3 examines the effects of back patching an executable after the
build process. We start by dropping BackPatch.exe and DbgHelp.dll into the
project's output directory (either \Debug or \Release). Next, we
add a Post-Build Event which calls our command line tool (see Figure 11). The
command line we enter is "$(OutDir)\BackPatch.exe" -s variable -f
$(TargetFileName) -p "$(OutDir)" -n 0x11111111 -v.
Figure 11: Post-Build Event
After compiling and linking, our tool is invoked, as shown in Figure 12.
For Visual Studio, we should drop the -v and provide a -q
switch for suppressed output.
Figure 12: Tool Invocation
Running sample three provides the expected result: 0x11111111 is
displayed. Next, we examine the Checksum of the file using a tool such as
PEChecksum presented in An Analysis of
the Windows PE Checksum Algorithm. In Figure 13, we see the checksum
is incorrect.
Figure 13: PE Checksum
In addition, file times will have been altered. This does not pose a problem
in the Visual Studio environment (for example, a changed file time does not
invoke additional compilations). If we were patching a CLR assembly that
uses Strong Names, we would want to run the tool before sn.exe is invoked by
the linker. See the related Strong Name Assemblies (Assembly Signing) and
AssemblyDelaySignAttribute Class. As exercises left to the reader, the tool
can be enhanced with the following:
- Add a quiet (-q) switch
- Save and restore original filetimes
- Recalculate the PE checksum
Recall that ImageHlp routes calls into DbgHelp, so if recalculating the
checksum using ImageHlp's CheckSumMappedFile, we must include the
DbgHelp library before the ImageHlp library. An alternative is to calculate
the checksum ourselves with an equivalent function. Such a function was
presented in Grafting Compiled
Code: Unlimited Code Reuse. In the latter article, we grafted the
compiled code of CheckSumMappedFile into our own executable.
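For the checksum exercise, here is a portable sketch of the calculation as it is commonly described: sum the file as 16-bit little-endian words with end-around carry, treat the 4-byte CheckSum field as zero, then add the file length. This is an illustration under those assumptions and not the code from either referenced article. peChecksum is a hypothetical name, and the caller supplies the offset of the CheckSum field (e_lfanew + 88 in the standard layout), which is assumed to be even.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes a PE-style checksum over 'image', ignoring the four bytes of
// the CheckSum field at 'checksumOffset' (assumed to be an even offset).
std::uint32_t peChecksum(const std::vector<std::uint8_t>& image,
                         std::size_t checksumOffset) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < image.size(); i += 2) {
        // Skip both 16-bit halves of the CheckSum field.
        if (i == checksumOffset || i == checksumOffset + 2) continue;
        std::uint32_t word = image[i];
        if (i + 1 < image.size())
            word |= static_cast<std::uint32_t>(image[i + 1]) << 8;
        sum += word;
        sum = (sum & 0xFFFF) + (sum >> 16);  // fold the carry back in
    }
    sum = (sum & 0xFFFF) + (sum >> 16);      // final fold to 16 bits
    return sum + static_cast<std::uint32_t>(image.size());
}
```

Because the field itself is excluded from the sum, the result can be written back into the header without invalidating it, which is what makes recalculation after a patch a single pass.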
Downloads
- Download Program - PEChecksum - 108 kB
(http://www.codeproject.com/KB/cpp/PEChecksum/PEChecksum.zip)
Checksums
- BackPatch1.zip
- MD5: 2AF82B1C938ECEE614950491B33393DC
- SHA-1: 2BDEAC1AD7EF3A63576F9CA1A7F12C732CA0D33B
- BackPatch2.zip
- MD5: 059B9B3578CCA165FCE4ACB2099F0364
- SHA-1: 18ACF5B093817267FED962EB03DFA5D2DDA14B57
- BackPatch3.zip
- MD5: 114CFBC047B4EC6B5B0131B5B44C6E90
- SHA-1: D4E12F5001895075C30F17FBF45D6B789AAE9FEF
Revisions
- 03.23.2008 Added DIA SDK Information
- 03.16.2008 Initial Release