
CrashRptEx - An Extension to the CrashRpt Crash Reporting System

23 May 2018, LGPL3
How to use CrashRptEx to avoid some of the pitfalls of crash reporting in MFC apps, or to let your application continue after a crash

Introduction

The project introduced here is an extension of the excellent CrashRpt crash reporting system by Mike Carruth and zexspectrum, who have both written CodeProject articles about it. It deals with some pitfalls specific to MFC and SysWOW64 and adds support for continuing execution after a crash.

Background

Before coming across crashrpt, I had rolled my own code, partly based on Hans Dietrich's XCrashRpt. When the time came to add functionality, I looked around for well-supported but lightweight alternatives and considered several of the candidates listed in the crashrpt Wiki. To cut a long story short: I liked crashrpt best by a long shot because of its relative simplicity. This can partly be attributed to the fact that it targets Windows applications only, which allows direct use of many of the Microsoft debugging tools in crash analysis.

Integrating it with my application turned out to be very straightforward using the available documentation, and there was almost no fiddling with special cases. However, my application was created in MFC and runs on 32-bit machines, on 64-bit machines in SysWOW64, and in native 64-bit mode. As it turns out, both the use of MFC and running in SysWOW64 lead to a few issues that need to be dealt with in order to catch all crashes and correctly report their origin.

I also had the additional requirement that users should be able to continue execution after a crash. While this is generally considered a bad idea because the program's memory is most likely corrupted, it makes perfect sense in some scenarios. My users may have prepared an experiment for days and run it for hours. If the application then crashes because of a division by zero in some minor online analysis, or because I forgot to catch an exception somewhere, this work would be in vain, even if I allowed them to save the already acquired data the way Word does it through Microsoft's "Dr. Watson".

Of course, this is only the second-best solution next to writing better software, but the mere existence of crash reporting systems shows that crashes do occur. Also, in some environments, users may have to use alpha or beta software for their everyday work because doing the work with the risk of failure is still better than not doing it at all.

Still, crashrpt was far ahead of everything I had programmed myself. It was thoroughly tested and maintained by zexspectrum, who also turned out to be extremely responsive to both proposals and the few minor issues I found in the original version. I therefore decided to add the required extensions to CrashRpt and use the derivative in my program. After several months of using it without problems, I feel it is time to share this comparatively small contribution, and what I know about the MFC issues, with the community in the hope that it turns out to be useful to someone.

Using the Code

The code can be used in the same way as crashrpt, which has excellent documentation, FAQ and Wiki entries in addition to the articles mentioned above. The additional features are optional and documented in the CrashRptEx.h header. They are also briefly described below along with some background information.

In a Nutshell

The following new options and functions are added.

C++
CRASHRPTAPI(int) crAllowContinue(DWORD dwFlags);
CRASHRPTAPI(int) crDiscardError(CR_EXCEPTION_INFO &ei);
CRASHRPTAPI(int) crHandleError(CR_EXCEPTION_INFO &ei);  

The first function sets, for the current thread, whether the crash handler should allow program execution to continue. The exact behavior is controlled by dwFlags, which is a combination of:

(1) One of

CR_INST_APP_CONTINUE (User chooses, default is termination)
CR_INST_APP_CONTINUE_DEFAULT (User chooses, default is to continue)
CR_INST_APP_CONTINUE_ONLY (Always continue)
CR_INST_APP_TERMINATE (Terminate the application.)

If one of the first three is chosen, a CR_EXCEPTION_INFO is thrown instead of terminating the application. It can be caught by an appropriate catch clause in the application's call stack.

(2) ... and optionally

CR_INST_APP_CONTINUE_NOSENDER (Do not call crash sender, throw exception info.)

With this flag, the crash sender is not launched and the application is not terminated, even if CR_INST_APP_TERMINATE is chosen. Instead, the exception info is thrown and the application can call crHandleError in the catch clause, which will behave according to the flags described in (1), or crDiscardError (e.g., after logging the error) to quietly continue execution independently of these flags.
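The catch pattern described above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: FakeExceptionInfo, faultyAnalysisStep, healthyStep and runStepGuarded are hypothetical stand-ins and not part of CrashRptEx. In a real application, the thrown object would be a CR_EXCEPTION_INFO, and the catch block is where crDiscardError or crHandleError would be called.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for CR_EXCEPTION_INFO (the real struct comes from the library headers).
struct FakeExceptionInfo {
    int exctype;  // which kind of crash was detected
};

// Stand-in for code whose crash handler runs with the "no sender" option:
// the exception info is thrown instead of launching the crash sender.
inline void faultyAnalysisStep() {
    throw FakeExceptionInfo{1};
}

// A step that completes normally.
inline void healthyStep() {}

// Runs one unit of work and reports whether it completed.
// The catch block only logs here; a real application would decide
// between handling and discarding the error at this point.
inline bool runStepGuarded(void (*step)(), std::vector<std::string>& log) {
    try {
        step();
        return true;
    } catch (FakeExceptionInfo& ei) {  // caught by reference where unwinding should stop
        log.push_back("crash of type " + std::to_string(ei.exctype) + " discarded");
        return false;
    }
}
```

Calling runStepGuarded(faultyAnalysisStep, log) returns false and leaves one log entry, while the caller keeps running and can go on to the next step.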

C++
int crEnableProcessCallbackFilter(BOOL bEnable);
int crProcessCallbackFilterStatus();  

These functions disable/enable and query an exception filter that quietly swallows exceptions raised in Windows callback routines in apps running under SysWOW64 (this is a known Microsoft bug, see below).

C++
WNDPROC crInstallWndProcWrapper(pfnWndProc);
int crEnableWndProcWrapper(BOOL bEnable);
int crWndProcWrapperStatus();

These functions install (the first is actually a macro dealing with the peculiarities of MFC), enable or disable, and query the status of a wrapper around the window procedure that implements the aforementioned catch clause. This is necessary because Microsoft's Dr. Watson is launched when an exception occurs behind a call into the (MFC?) window procedure, never giving CrashRpt a chance.
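The idea of such a wrapper can be illustrated without any Windows headers. The sketch below is not the CrashRptEx implementation: GuardedCallback, FakeCrashInfo, Callback and sampleProc are hypothetical names, and a plain function pointer stands in for a WNDPROC. It only shows the principle of putting a catch clause around a callback and being able to switch it on and off at runtime.

```cpp
#include <cassert>

// Hypothetical stand-in for the exception info thrown by the crash handler.
struct FakeCrashInfo {
    int exctype;
};

// Simplified callback signature standing in for a window procedure.
using Callback = long (*)(int message);

// Wraps a callback in a catch clause that can be enabled or disabled.
class GuardedCallback {
public:
    explicit GuardedCallback(Callback inner) : inner_(inner), enabled_(true) {}

    void enable(bool on) { enabled_ = on; }
    bool enabled() const { return enabled_; }

    long operator()(int message) const {
        if (!enabled_) {
            return inner_(message);  // wrapper disabled: exceptions propagate
        }
        try {
            return inner_(message);
        } catch (FakeCrashInfo&) {
            return -1;  // unwinding stops here; the caller keeps running
        }
    }

private:
    Callback inner_;
    bool enabled_;
};

// Sample callback that "crashes" on one particular message.
inline long sampleProc(int message) {
    if (message == 13) {
        throw FakeCrashInfo{1};
    }
    return message * 2;
}
```

With the wrapper enabled, GuardedCallback(sampleProc)(13) returns -1 instead of letting the exception escape. After enable(false), the same call propagates the exception to the caller.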

C++
CRASHRPTAPI(int) crModifyFlags(DWORD dwFlags, DWORD dwMask);

This function can be called at any time during program execution. It modifies the flags of an already installed crash handler without re-installation (which would require re-adding all files, etc.). It exists mainly because the functionality above needs it, but it can come in handy independently.
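A flags/mask pair like this is commonly implemented as a masked update. The following sketch shows that technique. It assumes (this is not confirmed by the library source shown here) that the bits selected by the mask are replaced by the corresponding bits of the new flags, while all other bits keep their old value. applyMaskedFlags is a hypothetical helper, not a CrashRptEx function.

```cpp
#include <cassert>
#include <cstdint>

// Assumed semantics of a flags/mask update:
// bits inside `mask` are taken from `flags`, bits outside are preserved.
constexpr std::uint32_t applyMaskedFlags(std::uint32_t current,
                                         std::uint32_t flags,
                                         std::uint32_t mask) {
    return (current & ~mask) | (flags & mask);
}
```

For example, with current = 0xA, flags = 0x4 and mask = 0x6, only bits 1 and 2 are rewritten, giving 0xC. An empty mask leaves the current flags untouched.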

The ability to allow program execution to continue is demonstrated in both the WTL and the MFC test applications while all other features are only demonstrated in the MFC version. Let us have a look at how to integrate the new functionality in your application focusing on the latter test app.

CrashRpt was designed to catch crashes and launch the CrashSender application. CrashSender keeps the crashed process alive until it has collected all the information it is configured to send, then allows the process to end and sends the information about the process and the crash by the configured method. The CrashSender application is launched from the crash handler installed by the library. To allow the application to continue from a defined position (e.g., by unwinding the stack all the way to the message loop), a few changes to CrashRpt were necessary. First and foremost, the crash handlers should no longer terminate the process. Let's look at a typical crash handler as installed by CrashRpt(Ex):

C++
// Structured exception handler
LONG WINAPI CCrashHandler::SehHandler(PEXCEPTION_POINTERS pExceptionPtrs)
{ 
    CCrashHandler* pCrashHandler = CCrashHandler::GetCurrentProcessCrashHandler();
    ATLASSERT(pCrashHandler!=NULL);  

    if(pCrashHandler!=NULL)
    {
        // Acquire lock so that other threads (if any) cannot crash while we are 
        // inside. 
        pCrashHandler->CrashLock(TRUE);

        CR_EXCEPTION_INFO ei;
        memset(&ei, 0, sizeof(CR_EXCEPTION_INFO));
        ei.cb = sizeof(CR_EXCEPTION_INFO);
        ei.exctype = CR_SEH_EXCEPTION;
        ei.pexcptrs = pExceptionPtrs;

#ifdef CRASHRPT_EX
        // AS: Error report generation now in _s_HandleError
        _s_HandleError(ei, pCrashHandler);
#else
        pCrashHandler->GenerateErrorReport(&ei);

        // Terminate process
        TerminateProcess(GetCurrentProcess(), 1);    
#endif
    }   

    // Unreachable code  
    return EXCEPTION_EXECUTE_HANDLER;
} 

The code of the original version is still visible in the #else clause. The original error report creation and the call to terminate the process have been moved to a function as there is some more code involved which is repeated in every crash handler. The new function is:

C++
void _s_HandleError(CR_EXCEPTION_INFO &ei, CCrashHandler * pCrashHandler)
{
  // Maybe we should later add an option for which types of crashes we should
  // allow to continue execution.
  DWORD dwFlags = pCrashHandler->IsContinueAllowed();
  
  if (dwFlags)
  {
      pCrashHandler->ModifyFlags(dwFlags, CR_INST_APP_CONTINUE_MASK);
      if (dwFlags & CR_INST_APP_CONTINUE_NOSENDER)
          throw ei;
  }
  
  // This will return as soon as the screenshot was set
  pCrashHandler->GenerateErrorReport(&ei);
  // If continue is allowed, we wait for the launcher to finish and
  // read out the exit code
  if (dwFlags)
  {
      WaitForSingleObject(ei.hSenderProcess, INFINITE);
      DWORD dwExitCode = 1;
      GetExitCodeProcess(ei.hSenderProcess, &dwExitCode);
      if (dwExitCode & CR_INST_APP_CONTINUE)
      {
          pCrashHandler->CrashLock(FALSE);
          pCrashHandler->ChangeGUID();
          throw ei;
      }
  }
          
  switch(ei.exctype)
  {
  case CR_CPP_NEW_OPERATOR_ERROR:
  case CR_CPP_INVALID_PARAMETER:
      // Unlock the crash handler where necessary before terminating
      pCrashHandler->CrashLock(FALSE);
      break;
  default:
      break;
  }
  // Terminate process
  TerminateProcess(GetCurrentProcess(), 1);    
}

First, the handler checks whether continuation is allowed at all (we will see later that this may differ from thread to thread and can be changed at runtime at any time) and modifies the crash handler flags to pass this information on to the crash sender when it is launched. The next block...

C++
if (dwFlags & CR_INST_APP_CONTINUE_NOSENDER)
          throw ei;

...is due to another new option of the system, which only makes sense if one intends to allow applications to continue: if CR_INST_APP_CONTINUE_NOSENDER is selected, the sender is not launched in the crash handler. Instead, the exception information is thrown and can be caught further up the call stack. The feature and its uses are described in more detail below. If the option is not chosen, we launch the crash sender. It presents the option to continue to the user and, unless continuing is set as the only option, the user can modify the default choice:

Image 1

If the user has chosen to continue the application, we unlock the crash handler and change its GUID. CCrashHandler::ChangeGUID() is a new function introduced for this purpose. It is necessary because a second crash could occur in the application later, and we would like it not to overwrite the first crash in the queue and to be recognized as a different crash by the crash analysis. The exception information is then thrown. The application can catch CR_EXCEPTION_INFO references wherever unwinding should stop and can then decide, based on this information, whether to really continue execution.

If the application is not to be continued, TerminateProcess() is called after unlocking the crash handler where necessary.
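The decision flow of _s_HandleError can be condensed into a small pure function, which makes the three possible outcomes easier to see. This is a simplified sketch, not library code: the bit values of kContinue and kNoSender are invented for illustration (the real constants are defined in the CrashRptEx headers), and Outcome and decide are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit values standing in for CR_INST_APP_CONTINUE and
// CR_INST_APP_CONTINUE_NOSENDER; the real values may differ.
const std::uint32_t kContinue = 0x1;
const std::uint32_t kNoSender = 0x2;

enum class Outcome {
    ThrowWithoutSender,  // exception info thrown, sender never launched
    ThrowAfterSender,    // sender ran and the user chose to continue
    Terminate            // process is terminated
};

// continueFlags:  result of the "is continue allowed" check (0 = not allowed)
// senderExitCode: exit code of the crash sender, only relevant if it was launched
inline Outcome decide(std::uint32_t continueFlags, std::uint32_t senderExitCode) {
    if (continueFlags != 0 && (continueFlags & kNoSender) != 0) {
        return Outcome::ThrowWithoutSender;
    }
    if (continueFlags != 0 && (senderExitCode & kContinue) != 0) {
        return Outcome::ThrowAfterSender;
    }
    return Outcome::Terminate;
}
```

Note that a "continue" exit code from the sender has no effect when continuation was not allowed in the first place: decide(0, kContinue) still yields Outcome::Terminate.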

Image 2

Pitfalls When Catching Crashes (Not Only) in MFC Apps

Image 3

Exceptions in SysWOW64

When running 32-bit applications on a 64-bit system, the process will usually swallow exceptions raised behind a window callback. This is a Windows bug which may stay in place because it has been there for a long time and some applications may have started to rely on it. For details, see the Microsoft KB article, which includes a hotfix, and the related forum entry. CrashRptEx contains the functions crEnableProcessCallbackFilter and crProcessCallbackFilterStatus that check for the presence of the hotfix and enable/disable the callback filter which is responsible for swallowing the exceptions.

To ensure that exceptions pass the callback (and thus are recognized by CrashRpt), disable the callback filter as follows (this requires the hotfix):

C++
int ret = crEnableProcessCallbackFilter(FALSE);

You can query the state of the hotfix and the filter at any time by calling:

C++
int ret = crProcessCallbackFilterStatus(); 

A return value of zero indicates that exceptions will pass a callback:

C++
//! 1: Hotfix present and filter active
//! 2: Hotfix not present, filter active if this is an affected system,
//!    not active otherwise
//! 0: Hotfix present, filter not active
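A small helper can turn these status codes into something readable for a log file. The sketch below only encodes the three values listed above. exceptionsPassCallback and describeFilterStatus are hypothetical helper names, and treating any other value as "unknown" is my assumption.

```cpp
#include <cassert>
#include <string>

// True only for status 0 (hotfix present, filter not active),
// the one value for which exceptions are stated to pass a callback.
inline bool exceptionsPassCallback(int status) {
    return status == 0;
}

// Human-readable description of the documented status codes.
inline std::string describeFilterStatus(int status) {
    switch (status) {
    case 0:  return "hotfix present, filter not active";
    case 1:  return "hotfix present, filter active";
    case 2:  return "hotfix not present, filter active on affected systems";
    default: return "unknown status";  // assumption: other values are not documented
    }
}
```

In an application, the argument would be the return value of crProcessCallbackFilterStatus(), for example to write a warning to the log when the filter is still active.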

Exceptions Behind a Windows Callback

If your application uses MFC (and possibly not only then), there are additional issues: an exception handler (for certain exceptions) is included in the code calling the window procedure, so the application crashes and Microsoft's Dr. Watson kicks in before CrashRpt has a chance to identify the problem. This behavior can also appear in applications that were not previously affected when the window procedure is hooked using SetWindowsHookEx. The solution is to include

History

  • 05/06/2012: Initial version

License

This article, along with any associated source code and files, is licensed under The GNU Lesser General Public License (LGPLv3)

