Open source software protection - Part 2

Kamal Shankar

May 2, 2003

This is a better alternative to my previous solution.

Screenshot of Loader32 Dialog

Screenshot of the Dialog based implementation of Loader32 Code

Introduction

This is an update to my previous article on protection, "Open source software protection system", where you will find all the main ideas outlined. This article implements, in software, the ideas and suggestions I have received since I submitted that article.

Before I proceed, I must stress that this protection system must not be viewed as production-level copy protection. It does not "copy protect" any software: one can make unlimited copies of programs "protected" with it, but none of them will run correctly except on the computer on which the program was authorized to run.

My previous License for viewing the document stands, and I would like to add the following clauses too:

You must agree to post any major/important changes that you make to the source code provided.
Third parties must obtain my authorization before implementing this protection system commercially.

By major/important code changes I mean anything which makes the code more efficient or stable or effective etc.

Having agreed to the License, you may now proceed.

Background

This article assumes that you have gone through my previous ramblings and been able to fathom what exactly I am up to ;) Also, a working knowledge of the PE file layout is necessary if you wish to understand or modify the code provided with this article. Otherwise, you can just build the code and post suggestions and bug reports anyway!

The main logic behind the protection system is that the heart of any PE file is one (or more) machine-executable sections, called code sections, usually residing in a section named .text or .code. When loading the program, the Win32 program loader looks up a special RVA named AddressOfEntryPoint, which is the RVA at which execution of the code starts. I should emphasize that the PE header is really just a set of structures, and to learn about the PE file you just need to get to know their member variables.
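To make the role of AddressOfEntryPoint concrete, here is a minimal sketch of reading it from a PE image that has already been loaded into a byte buffer. It relies only on the documented PE layout (e_lfanew at offset 0x3C, then the "PE\0\0" signature, the 20-byte file header, and AddressOfEntryPoint 16 bytes into the optional header). The function names are hypothetical and are not taken from the article's sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit value. The caller guarantees off + 4 <= buf.size().
static std::uint32_t ReadU32(const std::vector<unsigned char>& buf, std::size_t off) {
    return static_cast<std::uint32_t>(buf[off]) |
           (static_cast<std::uint32_t>(buf[off + 1]) << 8) |
           (static_cast<std::uint32_t>(buf[off + 2]) << 16) |
           (static_cast<std::uint32_t>(buf[off + 3]) << 24);
}

// Hypothetical helper: stores AddressOfEntryPoint (an RVA) in entryRva.
// Returns false if the buffer does not look like a PE image.
bool ReadEntryPointRva(const std::vector<unsigned char>& image, std::uint32_t& entryRva) {
    // DOS header: "MZ" magic, e_lfanew stored at offset 0x3C.
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z') return false;
    const std::size_t peOffset = ReadU32(image, 0x3C);
    // We need the signature (4) + file header (20) + 20 bytes of optional header.
    if (peOffset > image.size() || image.size() - peOffset < 44) return false;
    if (image[peOffset] != 'P' || image[peOffset + 1] != 'E' ||
        image[peOffset + 2] != 0 || image[peOffset + 3] != 0) return false;
    // AddressOfEntryPoint sits 16 bytes into the optional header.
    entryRva = ReadU32(image, peOffset + 4 + 20 + 16);
    return true;
}
```

The value returned is an RVA (relative to the image base in memory), not a file offset, so it still has to be mapped through the section table before it can be used to locate bytes on disk.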

Now if the program entry point is missing, in all probability the Win32 loader will refuse to run the program. What this program (EXEProt) does is leave everything in the PE file intact except the code following AddressOfEntryPoint, which it separates from the EXE file and encrypts using Blowfish with the machine's HardwareID key. (The code is kept in a separate file for the time being; please see Loader32Dlg.cpp of the Loader32 Dialog implementation to understand why.)

The Loader32 program just does a CreateProcess() on the protected file with the CREATE_SUSPENDED flag set, so that the new process is stopped before it reaches the program entry point (which contains zero bytes for the time being). Then, using a single WriteProcessMemory() call, it writes the decrypted code (obtained by decrypting the contents of "code_sec.dat" with the current hardware key) into the process memory, and resumes the process. Hence, if the Loader32 program is run on the same machine on which the HardwareID program was run, the correct decrypted data will be obtained; otherwise the decrypted data will be garbage and the protected program will crash.
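The sequence described above can be sketched roughly as follows. This is not the code from Loader32; LaunchPatched and PatchFitsImage are hypothetical names, and the sketch assumes the caller already knows the absolute entry address (image base plus AddressOfEntryPoint) and that the image is loaded at its preferred base. The Windows part only compiles on Windows; the range check is portable.

```cpp
#include <cstddef>
#include <cstdint>

// Portable pre-check (hypothetical): does [entryRva, entryRva + patchSize)
// lie inside an image of sizeOfImage bytes?
bool PatchFitsImage(std::uint32_t entryRva, std::size_t patchSize,
                    std::uint32_t sizeOfImage) {
    return entryRva <= sizeOfImage && patchSize <= sizeOfImage - entryRva;
}

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: start exePath suspended, write the decrypted code at
// entryAddress in the child process, then let the main thread run.
bool LaunchPatched(const wchar_t* exePath, void* entryAddress,
                   const unsigned char* code, SIZE_T codeSize) {
    STARTUPINFOW si = {};
    si.cb = sizeof(si);
    PROCESS_INFORMATION pi = {};
    if (!CreateProcessW(exePath, NULL, NULL, NULL, FALSE,
                        CREATE_SUSPENDED, NULL, NULL, &si, &pi))
        return false;

    DWORD oldProtect = 0;
    SIZE_T written = 0;
    // Code pages are normally not writable, so relax the protection first.
    bool ok = VirtualProtectEx(pi.hProcess, entryAddress, codeSize,
                               PAGE_EXECUTE_READWRITE, &oldProtect)
           && WriteProcessMemory(pi.hProcess, entryAddress, code, codeSize, &written)
           && written == codeSize;
    if (ok) {
        // Restore the original protection and make sure the CPU sees the new code.
        VirtualProtectEx(pi.hProcess, entryAddress, codeSize, oldProtect, &oldProtect);
        FlushInstructionCache(pi.hProcess, entryAddress, codeSize);
        ResumeThread(pi.hThread);
    } else {
        TerminateProcess(pi.hProcess, 1);  // never run a half-patched image
    }
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return ok;
}
#endif
```

Terminating the child on any failure is a design choice of this sketch: it avoids resuming a process whose entry point still contains zero bytes.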

Well, actually, after Matitiahu Allouche pointed out in his post the lack of user friendliness in such an approach, the program has been updated to alert the user if he is running the program on a different (unauthorised) machine.

The following dialog is displayed on such a computer:

Outcome when the Loader was run on an unauthorised computer
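One way such a check can work is to compare a digest of the decrypted code with a digest recorded when the file was protected. The sketch below illustrates that idea only. To stay short and self-contained it uses a repeating-key XOR as a stand-in for Blowfish and 64-bit FNV-1a as a stand-in for MD5, and all names in it are hypothetical. It is an assumption about the approach, not the code from Loader32.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Stand-in for MD5: 64-bit FNV-1a. Not cryptographically secure.
std::uint64_t Digest(const Bytes& data) {
    std::uint64_t h = 14695981039346656037ULL;
    for (std::size_t i = 0; i < data.size(); ++i) {
        h ^= data[i];
        h *= 1099511628211ULL;
    }
    return h;
}

// Stand-in for Blowfish: repeating-key XOR. It is symmetric, so the same
// call both "encrypts" and "decrypts".
Bytes XorCrypt(const Bytes& in, const std::string& key) {
    Bytes out(in);
    if (key.empty()) return out;
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] ^= static_cast<unsigned char>(key[i % key.size()]);
    return out;
}

// True when decrypting with this machine's key reproduces code whose digest
// matches the one recorded at protection time.
bool IsAuthorisedMachine(const Bytes& encrypted, const std::string& machineKey,
                         std::uint64_t expectedPlainDigest) {
    return Digest(XorCrypt(encrypted, machineKey)) == expectedPlainDigest;
}
```

With a check like this the loader can show a message and exit instead of resuming a process that is bound to crash.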

Using the code

The sources have been provided in full; to get a working demo, just compile them!

The current implementation is meant to be applied externally, i.e., the software developer does not need to make any changes to his code to use this protection. Implementing the protection is extremely easy for both the software publisher and the user himself.

It consists of the following steps:

The interested customer downloads the HardwareID program to his computer and runs it. A file named SysData.DAT, containing system hardware info along with an MD5 key (which should be unique to the computer), is created. The user is encouraged to go through the contents; this has been done to honour the privacy of his information. If he is satisfied with it, he sends the file to the software publisher. Here I should add that tampering with the contents of SysData.DAT can be easily detected, as the HardwareID key is actually the MD5 digest of the first string containing the hardware details.
The software publisher takes (and verifies) the key and uses it to "protect" the application program file(s) with the EXEProt application. As per the current implementation, four files are created: Real.EXE (the program file with everything but the code section), Patch.DAT (containing the virtual size of the code section), Code_Sec.DAT (the encrypted code section) and code_sig.sig (containing the MD5 digests of the encrypted code_sec.dat and of its decrypted contents).
The publisher now chooses a loader: if loading the protected file needs user intervention, he can use the Loader32Dialog implementation; otherwise he uses the fast and quiet Loader32Service program to dynamically load the protected file(s).
The publisher can change the filename and registry constants in the above-mentioned programs to reflect those of the current application. All variables have been clearly documented, and the constants are #defined, so it will take only a few seconds to change them. Alternatively, he can continue using the default names (if they suit him).
The publisher sends the customer the Loader and the other three files.
The protection system is in place.
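The tamper check mentioned in the first step amounts to recomputing the digest of the hardware-details string and comparing it with the key stored next to it. Here is a minimal sketch of that idea. The two-line file layout (details on line one, key on line two) is an assumption made for illustration, FNV-1a is used as a stand-in for MD5 to keep the example short, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Stand-in for MD5: 64-bit FNV-1a rendered as 16 hex digits.
std::string DigestHex(const std::string& text) {
    std::uint64_t h = 14695981039346656037ULL;
    for (std::size_t i = 0; i < text.size(); ++i) {
        h ^= static_cast<unsigned char>(text[i]);
        h *= 1099511628211ULL;
    }
    std::ostringstream out;
    out << std::hex << std::setw(16) << std::setfill('0') << h;
    return out.str();
}

// Assumed layout: line 1 = hardware details, line 2 = key derived from line 1.
// Returns true only if the key still matches the details.
bool SysDataIsUntampered(const std::string& fileContents) {
    std::istringstream in(fileContents);
    std::string details, key;
    if (!std::getline(in, details) || !std::getline(in, key)) return false;
    return DigestHex(details) == key;
}
```

Note that a check like this only detects accidental or naive edits: anyone who knows the digest algorithm can edit the details and recompute the key.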

The heart of CExeProtector is the following function:

BOOL CExeProtector::DumpSection(DWORD nFileOffset, 
    DWORD dwNumberOfBytesToRW, DWORD /*dwRVA*/)
{
  /*
  Previously, this function used file streams, 
  but due to severe performance bottlenecks, I 
  have replaced them with
  Win32 APIs. - Kamal Shankar 17th June 2003
  Note: After the above modification the program was not 
  tested as extensively as the previous 
  file stream based versions.  
  szMyFileName contains the original EXE filename which we are to protect.
  We will read the WHOLE contents of szMyFileName into an 
  unsigned char array - ucOriginalBuffer.
  ucOriginalBuffer is a 1:1 mapping of the file on disk, 
  and by loading the file directly into 
  memory, we boost program performance considerably.
  All further operations are purely memory based.
  */


  HANDLE hFile = CreateFile(szMyFileName,GENERIC_READ,FILE_SHARE_READ,
    NULL,OPEN_EXISTING,FILE_ATTRIBUTE_NORMAL, NULL);
  if(hFile==INVALID_HANDLE_VALUE) //CreateFile never returns NULL on failure
  {
    AfxMessageBox(("File open error!\n["+(CString)szMyFileName+"]") 
        ,MB_OK|MB_ICONINFORMATION);
    return FALSE;
  }  
  DWORD dwBufferSize=GetFileSize(hFile,NULL),dwBytesRead=0;
  if(dwBufferSize==INVALID_FILE_SIZE) //Could not get file size
  {
    CloseHandle(hFile);
    return FALSE;
  }


  unsigned char *ucOriginalBuffer = 
    new unsigned char[dwBufferSize]; //Check for exceptions


  if(!ucOriginalBuffer) 
  {
    MessageBox(NULL,"There was an error in dynamically"
     " allocating memory block.",
     "Not enough memory",MB_OK|MB_ICONSTOP); 
    CloseHandle(hFile); //Do not leak the file handle
    return FALSE;
  }
  BOOL bResult=ReadFile(hFile,ucOriginalBuffer,dwBufferSize,
     &dwBytesRead,NULL);
  CloseHandle(hFile);
  if(!bResult||(dwBufferSize!=dwBytesRead)) 
  {
    MessageBox(NULL,"There was an error in reading" 
      " the file contents into memory.",
      "File I/O Error",MB_OK|MB_ICONSTOP);
    delete[] ucOriginalBuffer;
    return FALSE;
  }

  unsigned char *ucTempHexBuffer = 
    new unsigned char[dwNumberOfBytesToRW]; //Check for exceptions

  if(!ucTempHexBuffer) 
  {
    MessageBox(NULL,
      "There was an error in dynamically allocating memory block.",
      "Not enough memory",MB_OK|MB_ICONSTOP);
    delete[] ucOriginalBuffer;
    return FALSE;
  }

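  //Make sure the requested range lies inside the file image
  if(nFileOffset>dwBufferSize||dwNumberOfBytesToRW>dwBufferSize-nFileOffset)
  {
    delete[] ucTempHexBuffer;
    delete[] ucOriginalBuffer;
    return FALSE;
  }
  //Copy the code out, then blank it in the in-memory image
  CopyMemory(ucTempHexBuffer,(ucOriginalBuffer+nFileOffset),
    dwNumberOfBytesToRW);
  ZeroMemory((ucOriginalBuffer+nFileOffset),dwNumberOfBytesToRW);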

  BF_Encrypt(&ucTempHexBuffer,szKey,dwNumberOfBytesToRW); //szKey


  

  hFile=CreateFile(szProtectedFileName,GENERIC_WRITE,FILE_SHARE_WRITE,
    NULL,OPEN_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL);
  if(hFile==INVALID_HANDLE_VALUE) 
  {
    AfxMessageBox(("File open error!\n["+(CString)szProtectedFileName+"]"),
      MB_OK|MB_ICONINFORMATION);
    delete[] ucTempHexBuffer;
    delete[] ucOriginalBuffer;
    return FALSE;
  }
  bResult=WriteFile(hFile,ucOriginalBuffer,dwBufferSize,
    &dwBytesRead,NULL); //Read dwBytesRead as dwBytesWritten
  CloseHandle(hFile);
  if((!bResult)||(dwBufferSize!=dwBytesRead)) 
  {
    AfxMessageBox(("File write error!\n["+(CString)szProtectedFileName+"]"),
      MB_OK|MB_ICONSTOP);
    delete[] ucTempHexBuffer;
    delete[] ucOriginalBuffer;
    return FALSE;
  }

  if(ucTempHexBuffer!=NULL) delete[] ucTempHexBuffer;
  ucTempHexBuffer=NULL;
  if(ucOriginalBuffer!=NULL) delete[] ucOriginalBuffer;
  ucOriginalBuffer=NULL;

  return TRUE;
}

As will be obvious to anybody who has seen my previous version, this function has been fully rewritten.

I have scratched out stream-based file I/O in favour of the faster Win32 APIs, and the original file is read ONLY once!

All encryption and protection procedures are done DIRECTLY in memory (RAM). This way, program performance has shot up by 129.78% in the worst-case scenario (15 MB reference file)!
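The in-memory step itself is small: copy a byte range out of the loaded image and zero it in place. A portable sketch of just that step, with the range check included, could look like this. CarveRange is a hypothetical name, and std::vector stands in for the raw new[] buffers used in DumpSection().

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Copies [offset, offset + count) out of image into carved and zero-fills
// that range in image. Returns false and changes nothing if the range does
// not lie inside the image.
bool CarveRange(Bytes& image, std::size_t offset, std::size_t count, Bytes& carved) {
    if (offset > image.size() || count > image.size() - offset) return false;
    Bytes::iterator first = image.begin() + static_cast<std::ptrdiff_t>(offset);
    Bytes::iterator last = first + static_cast<std::ptrdiff_t>(count);
    carved.assign(first, last);                           // bytes to be encrypted
    std::fill(first, last, static_cast<unsigned char>(0)); // blank them in the image
    return true;
}
```

After a successful call, carved holds the bytes to be encrypted and image holds what would be written out as the protected executable.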

I read the whole original file into memory, once. I also obtain the file offset corresponding to the program "Entry Point". As the file loaded in memory is a 1:1 mapping of the image on disk, I copy out the code section, encrypt it into code_sec.dat, zero out this section in memory, and write the modified image into a file named Real.EXE.

The file code_sec.dat is encrypted by EXEProt using the key entered in the text box of the program dialog.
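Getting from the entry point RVA to a file offset means walking the section table and finding the section whose virtual range contains the RVA. The following is a minimal sketch of that mapping, not the code from EXEProt. SectionInfo is a hypothetical struct holding three of the IMAGE_SECTION_HEADER fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// The IMAGE_SECTION_HEADER fields needed for the mapping.
struct SectionInfo {
    std::uint32_t virtualAddress;    // RVA at which the section is mapped
    std::uint32_t pointerToRawData;  // file offset of the section's data
    std::uint32_t sizeOfRawData;     // number of bytes present on disk
};

// Maps an RVA to a file offset. Returns false if no section holds the RVA
// in the part of the section that is actually stored in the file.
bool RvaToFileOffset(const std::vector<SectionInfo>& sections,
                     std::uint32_t rva, std::uint32_t& fileOffset) {
    for (std::size_t i = 0; i < sections.size(); ++i) {
        const SectionInfo& s = sections[i];
        if (rva >= s.virtualAddress && rva - s.virtualAddress < s.sizeOfRawData) {
            fileOffset = s.pointerToRawData + (rva - s.virtualAddress);
            return true;
        }
    }
    return false;
}
```

For example, with a section mapped at RVA 0x1000 whose data starts at file offset 0x400, an entry point RVA of 0x1234 corresponds to file offset 0x634.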

Note to users of previous version.

As mentioned in this article, I have fully rewritten the code of the ExeProt module. However, I have still kept the older version. It's still named "ExeProt_Src.zip", and is available here. The new, rewritten code is present in the file "ExeProtNew_Src.zip". You can download it from here.

Program BUGs and workarounds

I am very happy that people have found BUGs in my programs. Thank you. They have been listed here, and possible workarounds too have been stated. So try them till I know how to fix these up !

BUG: Processor speeds variable due to processor load/power management:

Details:

Thank you Bryan Cook and Ries.

Bryan's timeBeginPeriod() hack/test was really great! I see a great programmer here :=) What he did was set up a high-precision multimedia timer and simultaneously run my system fingerprinting code (which also sets up a high-priority multimedia timer to guess the processor speed). As a result, the precision of the multimedia timers dropped, resulting in a different system fingerprint than when his multimedia timer was not running!

Ries, on the other hand, brought laptop power management features to my notice. In his case, when he was running his i686 from batteries, the power management circuits dynamically changed the CPU voltage as per system load, thus varying the CPU speed.

Reason:

Actually, the processor speed enumeration is done by the SysInfo class written by Paul Wendt, and the function that determines the processor speed, calculateCpuSpeed(), comes from AMD! Incidentally, it uses multimedia timers too.

In a nutshell, here's what it does:
- Sets the priority to high, so that there will be a greater chance of getting the correct CPU speed.
- Iterates till it gets two results in a row that are identical.

But believe me, it works well (except in cases like the ones you have brought forward).
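The "two identical results in a row" rule can be sketched independently of how the speed is actually measured. In the sketch below the measurement is passed in as a callable, so the timer and priority handling of calculateCpuSpeed() are deliberately left out. StableReading and its retry limit are hypothetical and not part of the SysInfo class.

```cpp
#include <functional>

// Calls sample() until two consecutive readings are equal, taking at most
// maxTries readings. On success stores the stable value in result.
bool StableReading(const std::function<unsigned()>& sample,
                   unsigned maxTries, unsigned& result) {
    if (maxTries == 0) return false;
    unsigned previous = sample();
    for (unsigned i = 1; i < maxTries; ++i) {
        const unsigned current = sample();
        if (current == previous) {
            result = current;
            return true;
        }
        previous = current;
    }
    return false;  // readings never settled within maxTries
}
```

The retry limit is a design choice of this sketch: on a machine whose clock speed keeps changing, as in the cases reported above, an unbounded loop might never terminate.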

Workaround:

In the call to CheckMachine(char* szSystemInfoDumpString, char* szSystemKey, BYTE nMaskBits), nMaskBits denotes the information to mask and can take the following values:

  Value  Masks
  -----  -------
   00  Nothing (copies ALL system info)
   01  OS Name
   02  Processor speed
   03  Processor string
   04  Total RAM on system
   05  Drive names
   06  Volume names
   07  Volume serial numbers 
  
  The default value is 0 (defined in the corresponding Header)

If you have reason to believe that the processor speed may vary, then replace the default value of '00' with '02'.

Thus for Bryan and Ries (and others facing the same problem), by passing '02' to nMaskBits, you should get a consistent system fingerprint.
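To illustrate why masking one field stabilises the fingerprint, here is a small sketch that builds the string to be digested while skipping the masked field. It is not the CheckMachine() implementation. FingerprintSource, the field order, and the '|' separator are assumptions, with fields[0] corresponding to table value 01 (OS name) and fields[6] to 07 (volume serial numbers).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Concatenates every field except the masked one. maskValue follows the
// table above: 0 masks nothing, 1..7 skip exactly one field. The values
// are plain ids here, not bit flags that can be combined.
std::string FingerprintSource(const std::vector<std::string>& fields,
                              unsigned maskValue) {
    std::string out;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (maskValue != 0 && i + 1 == maskValue) continue;  // masked field
        out += fields[i];
        out += '|';
    }
    return out;
}
```

Two snapshots of the same machine that differ only in the measured processor speed produce the same string with a mask value of 2, and therefore the same digest.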

History

5th May 2003 - Updated (rewrote) the whole article, and also some of the source
15th May 2003 - Updated article+links and code of ALL files based on your helpful feedback :)
23rd June 2003 - Rewrote program code. All protection procedures moved into a class, and minor file I/O scratched. Those operations NOW directly done in RAM.