Click here to Skip to main content
15,867,308 members
Articles / Web Development / HTML

Building a Mixed-Mode Sampling Profiler

Rate me:
Please Sign up or sign in to vote.
4.90/5 (19 votes)
31 Jul 2014Ms-PL17 min read 63.5K   1.9K   41   34
Walking a native and a managed callstack is fairly easy. Walking a mixed-mode callstack is much much harder. Existing documentation is truly minimal. I hope this article and its sample profiler can shed some light in this area.

Image 1

Introduction

Are you interested in how profilers work and how to walk mixed-mode applications? We will explore the ICorProfiler interface for doing a mixed-mode stack walking. Due to limited documentation, I had to learn some things the hard way. In this article I will share some of the DOs and DONTs, limitations, and possibilities, links to resources, and things I have learned.

Background

This is a continuation of an article where we learned how to traverse a mixed-mode (unmanaged/managed) application using IDebugInterface and IXCLRProcessData. The reason I started with this was the need to optimize a C++/C# application, which was slow at importing large data files. Neither a native profiler nor a CLR profiler run separately could pin-point the problem. A Mixed-Mode Stackwalk with IDebugClient Interface which was my first attempt to walk a mixed-mode callstack resulted in a partial success. I could see function calls of the CLR library but my own code appeared camouflaged as some calls to a CLR function where my JITted code was being executed. In this article we will instead explore a profiler API of the CLR runtime, called ICorProfiler.

Definitions

Mixed-Mode Callstack

When we talk about a mixed-mode callstack, we refer to a non pure managed application. A truly managed application only contains managed code. A managed application, using some sort of interop/pinvoke, is not pure. It is a mixed-mode application. COM, C++/CLI, Interop, and/or PInvoke can be used to mix the two worlds. Native code that calls managed code, which in turn calls native code again. We will illustrate this with a sample.

Image 2

We start from the bottom, where a C++ app makes a call into a C++/CLI managed DLL, the managed DLL in turn calls native code. Now we have two transitions already. If this native code uses a COM object written in C#, we get yet another transition, and so it can continue. In the normal case, you would only have 1-2 transitions between managed/unmanaged code. If you get 5-6 layers, you probably have an indicator that you should redesign.

Profiling

Profiling is a form of dynamic analysis of the execution of a software. It can measure memory consumption, how many times a function has been called, the duration of a function call, and much more. It is generally used to guide optimization by pin-pointing the worst CPU hogs. Profiling can be done at source code level or at binary level.

The application I wanted to optimize had been "profiled" before, at source code level, by writing the stacktrace to disk in functions we thought was called often. Not very sophisticated, but it worked. Its weakness was that it only supported native code. Unfortunately I had reason to believe that the problem was in managed code.

Instrumenting Profiling

My own experience goes back to the 90s. I used a native profiler called Quantify from Rational. That profiler did instrumentation on a binary level by patching function entry/leave to do logging. The result was a new binary, and every library that was dynamically loaded was also automatically patched.

Event Based Profiling

Nowadays, languages running in a virtual machine have really gained ground. Native profilers are not good for those languages. An instrumenting profiler would profile also the virtual machine. Those languages have their own profiler, to profile code that is run by the VM. The VMs usually provide callbacks for profilers to inform about JIT-compilation, object creation, enter/leave.

Sampling Profiling

Sampling is when you take samples (walk the stack) at regular intervals (every 5-20 ms). This is a very fast method since it lets the application run at almost full speed. If the interval is small, the data collected should be quite accurate.

ICorProfiler

Exploring a sample profiler

Now we are going to build a profiler of both native and managed code. The documentation for ICorProfiler is minimal and the samples are few. Fortunately, I found an example here at CodeProject: Creating a Custom NET Profiler. Respect! Unfortunately, it is a managed-only profiler, but it serves to give an understanding how the profiler API is supposed to be used.

The profiler was extremely easy to use. Profiling was started by setting two environment variables:

c:\>SET COR_ENABLE_PROFILING=1
c:\>SET COR_PROFILER={C6DBEE4B-017D-43AC-8689-3B107A6104EF}
c:\>SampleApp.exe

The first one is pretty obvious. It enables profiling. The second one points to a specific profiler implementation by its CLSID.

The project was supplied with a sample app, called Hello World. Unfortunately, I could not use this profiler implementation for several reasons.

  • It was only for managed code
  • It was too slow
  • I needed control over the start and stop
  • I wanted to avoid deployment involving the Registry
  • I wanted to learn how to do the programming

In my particular case, I didn't have a need to measure the performance of the overall execution. Most of the time the application was waiting for user input. In particular it was the performance of the import/restore of a database I wanted to have a closer look at. I needed to start profiling immediately before the import, and stop it immediately after. When the app starts, a lot of data is loaded from the database and filesystem, and some internal data structures are built up. The import uses the same components and to some degree the same functions to do checking data integrity. If I start the profiler and let it run until the end of the import, I will get a big profile trace, which is not useful for me. I will not be able to see if it was generated from the application startup or from the import.

ICorProfiler Documentation

A good start is a Profiling Overview, where we learn that the profiler must be implemented as a COM DLL. This DLL runs in-process, meaning that it is loaded into the process space of the application you want to profile. The profiler uses the Profiler API of the CLR, and gets notifications. Furthermore, the profiler must be written in native/unmanaged code, otherwise its own execution would trigger events in the CLR.

The pages describe two ways of profiling. Using the Enter/Leave/Tail callbacks or using DoStackSnapshot. I decided to go for the latter because I found a guideline Profiler Stack Walking in the .NET Framework 2.0: Basics and Beyond on how to walk a mixed stack.

Exploring the Interfaces

A good starting point is Profiling Interfaces at MSDN.

There seems to be a lot of different interfaces, but they are in reality a few because many of them are just extensions of the interfaces.

  • ICLRProfiling
  • ICorProfilerCallback (1,2,3)
  • ICorProfilerInfo (1,2,3)

The reason there are several versions of the same interface is backward compatibility. An interface must never change. If it would, components written for the old interface would break. In order to add functionality, you can create a new interface containing the new functions. There are other options to implement versioning, but this is how Microsoft implements it with ICorProfiler:

  • ICorProfilerCallback is from CLR V1.0 and forward
  • ICorProfilerCallback2 is from v2.0 and forward
  • ICorProfilerCallback3 is from v4.0 and forward.

ICLRProfiling

The first class that appears is ICLRProfiling. It is the start Interface. It just contains one method.

MC++
CLRProfiling::AttachProfiler
HRESULT AttachProfiler(
  [in] DWORD dwProfileeProcessID,
  [in] DWORD dwMillisecondsMax,                     // optional
  [in] const CLSID * pClsidProfiler,
  [in] LPCWSTR wszProfilerPath,                     // optional
  [in] size_is(cbClientData)] void * pvClientData,  // optional
  [in] UINT cbClientData);                          // optional

It attaches a profiler to a running process. We have the options to specify a file path, meaning that we can avoid the Registry.

Creating an ICLRProfiling Object

CLR Hosting API

We are going to use the CLR Hosting API. It's an API that lets an unmanaged application integrate a CLR runtime into itself. Not only can you startup the CLR inside a native application, but you can also replace the garbage collector, add global exception handlers, and much more. There is an interesting book regarding the CLR Hosting API: Customizing the Microsoft® .NET Framework Common Language Runtime.

ICLRMetaHost

The CLR Hosting APIs are provided by MSCorEE.dll. It is a thin version-independent API. The first thing to do is to create an ICLRMetaHost object. This object is the door to the CLR.

MC++
#include <Metahost.h>
ICLRMetaHost* m_metahost = nullptr;
HRESULT hr = CLRCreateInstance(CLSID_CLRMetaHost, IID_ICLRMetaHost, (LPVOID*)&m_metahost);

ICLRMetaHost::GetRuntTime

The ICLRMetaHost object contains functions such as EnumerateInstalledRuntimes and GetRuntime, the latter starts a CLR runtime in the current process. Let's do that.

MC++
ICLRRuntimeInfo* m_runtimeInfo = nullptr;
hr = m_metahost->GetRuntime(_T("v4.0.30319"), IID_ICLRRuntimeInfo, (LPVOID*)&m_runtimeInfo);

ICLRProfiling

We can use the ICLRRuntime object to create objects that interact with the running CLR runtime. For doing profiling, we are going to need the ICLRProfiling object.

MC++
ICLRProfiling* m_clrProfiling = nullptr;
hr = m_runtimeInfo->GetInterface(CLSID_CLRProfiling, IID_ICLRProfiling, (LPVOID *)&m_clrProfiling);

ICLRProfiling::AttachProfiler

MC++
LPVOID pvClientData = NULL;
DWORD cbClientData = 0;
CLSID clsidProfiler;
HRESULT hr;
hr = CLSIDFromString(L"{C6DBEE4B-017D-43AC-8689-3B107A6104EF}", (LPCLSID)&clsidProfiler);
hr = m_clrProfiling->AttachProfiler(
          m_processId,
          timeout,
          &clsidProfiler,
          NULL, // or full path "C:\\Development\\DiagProfiler.dll"
          pvClientData /* used to send a parameter to the profiler */,
          cbClientData /* the size in bytes of pvClientData*/ );

The ICLRProfiling object has an AttachProfiler method. Since at this point I didn't have an implementation of a profiler, I tried attaching an existing implementation, the Net Profiler.

Attachprofiler needs the CLSID of the Profiler DLL. If the CLSID is in the Registry, the file path can be NULL, otherwise you must supply the absolute file path.

It did not work with the Net Profiler! Running the code returned a failure code of 0x80070002: "The system cannot find the file specified". After verifying the file path a 100 times, Googling the error code, double checking the CLSID too, I found the problem. I had not read the manual for AttachProfiler properly. Among its error codes, is to be found the following error code:

HRESULT_FROM_WIN32(ERROR_FILE_NOT_FOUND)
The target process does not exist or is not running a CLR that supports attachment.
This may indicate that the CLR was unloaded since the call to the runtime enumeration method.

I would have wanted to see a more accurate error code for this. The problem was that my process did not support attachment. The reason was that I had tried to attach a profiler from the application that I wanted to be profiled. This was not allowed according to the API.

In my second attempt, I tried to attach to another process. It didn't work either. The reason was that the profiler that I had was old. It was written for the CRL v2.0 profiling interfaces. Attaching a profiler to running processes is only supported by CLR v4.0 and forward. The sample profiler didn't implement this interface.

The Solution

First of all, if it wasn't for some great sources on the internet such as Profiler Stack Walking in .NET Framework 2.0: Basics and Beyond and Building a mixed mode stack walker I would probably not have finished my own implementation. The sources are theoretical, no source code are available. My contribution is about explaining how it can be done in practice, and tell you about the DOs and DONTs, and my experiences.

What We Need to Implement

We will need to implement the ICorProfilerCallback interface. The CLR will call the interface methods, implemented by our Profiler, to inform about events.

Startup and Initialization

ICorProfilerCallback::Initialize

If the profiler DLL is loaded at application startup, ICorProfilerCallback::Initialize() will be called.

ICorProfilerCallback::InitializeForAttach

If we instead attach to a running process using the ICLRProfiling::AttachProfiler method, ICorProfilerCallback::InitializeForAttach() is called instead.

ICorProfilerInfo::SetEventMask

Probably the first thing one wants to do is to set up the event mask, which is how you tell the CLR what events you are interested in getting callbacks for. There is a whole lot of events to subscribe for.

MC++
COR_PRF_MONITOR_NONE              = 0,
COR_PRF_MONITOR_FUNCTION_UNLOADS  = 0x1,
COR_PRF_MONITOR_CLASS_LOADS       = 0x2,
COR_PRF_MONITOR_MODULE_LOADS      = 0x4,
COR_PRF_MONITOR_ASSEMBLY_LOADS    = 0x8,
COR_PRF_MONITOR_APPDOMAIN_LOADS   = 0x10,
COR_PRF_MONITOR_JIT_COMPILATION   = 0x20,
COR_PRF_MONITOR_EXCEPTIONS        = 0x40,
COR_PRF_MONITOR_GC                = 0x80,
COR_PRF_MONITOR_OBJECT_ALLOCATED  = 0x100,
COR_PRF_MONITOR_THREADS           = 0x200,
COR_PRF_MONITOR_REMOTING          = 0x400,
COR_PRF_MONITOR_CODE_TRANSITIONS  = 0x800,
COR_PRF_MONITOR_ENTERLEAVE        = 0x1000,
COR_PRF_MONITOR_CCW               = 0x2000,
COR_PRF_MONITOR_REMOTING_COOKIE   = 0x4000 | COR_PRF_MONITOR_REMOTING,
COR_PRF_MONITOR_REMOTING_ASYNC    = 0x8000 | COR_PRF_MONITOR_REMOTING,
COR_PRF_MONITOR_SUSPENDS          = 0x10000,
COR_PRF_MONITOR_CACHE_SEARCHES    = 0x20000,
COR_PRF_MONITOR_CLR_EXCEPTIONS    = 0x1000000,
COR_PRF_MONITOR_ALL               = 0x107ffff,
COR_PRF_ENABLE_REJIT              = 0x40000,
COR_PRF_ENABLE_INPROC_DEBUGGING   = 0x80000,
COR_PRF_ENABLE_JIT_MAPS           = 0x100000,
COR_PRF_DISABLE_INLINING          = 0x200000,
COR_PRF_DISABLE_OPTIMIZATIONS     = 0x400000,
COR_PRF_ENABLE_OBJECT_ALLOCATED   = 0x800000,
COR_PRF_ENABLE_FUNCTION_ARGS      = 0x2000000,
COR_PRF_ENABLE_FUNCTION_RETVAL    = 0x4000000,
COR_PRF_ENABLE_FRAME_INFO         = 0x8000000,
COR_PRF_ENABLE_STACK_SNAPSHOT     = 0x10000000,
COR_PRF_USE_PROFILE_IMAGES        = 0x20000000,

COR_PRF_MONITOR_THREADS

In my case, I needed COR_PRF_MONITOR_THREADS which give callbacks when threads are created and destroyed.

MC++
STDMETHODIMP ICorProfilerCallback::ThreadAssignedToOSThread(ThreadID managedThreadID, DWORD osThreadID) 
{
    m_managed2NativeMap[managedThreadID] = osThreadID;
    m_native2ManagedMap[osThreadID] = managedThreadID;
    return S_OK;
}

It is called when a native thread is assigned a managed thread ID. I use the ThreadAssignedToOSThread callback to maintain a mapping between the two IDs. The callback is only called once per thread. This is actually a problem when we choose to attach to running processes, because the callback will never come. I haven't found a good way to find out if a thread is managed or not. We can use the ICorDebug interface, but you cannot attach a debugger to yourself.

COR_PRF_ENABLE_STACK_SNAPSHOT

I also needed COR_PRF_ENABLE_STACK_SNAPSHOT, which permitted me to do the actual stack walk with ICorProfilerInfo2::DoStackSnapshot. It works kind of like StackWalk64, but instead of you calling StackWalk64 for each frame, you call DoStackSnapshot with a callback function pointer, and it calls your callback for each frame.

ICorProfilerInfo2::DoStackSnapshot

The ICorProfilerInfo2::DoStackSnapshot routine:

MC++
HRESULT DoStackSnapshot(
    [in] ThreadID thread,
    [in] StackSnapshotCallback *callback,
    [in] ULONG32 infoFlags,
    [in] void *clientData,
    [in, size_is(contextSize), length_is(contextSize)] BYTE context[],
    [in] ULONG32 contextSize);

A call to DoStackSnapshot needs a function call pointer and a register CONTEXT (EIP, EBP, ESP). When DoStackSnapshot is called, the framework calls the function callback for each Managed frame on the stack, and once for each native code block in between (may contain many native frames).

MC++
HRESULT __stdcall StackSnapshotCallback (
    [in] FunctionID funcId,
    [in] UINT_PTR ip,
    [in] COR_PRF_FRAME_INFO frameInfo,
    [in] ULONG32 contextSize,
    [in] BYTE context[],
    [in] void *clientData
);

A call on DoStackSnapShot will get us a call sequence as illustrated below.

Image 3

For all callbacks, we get a CPU CONTEXT (EIP, EBP, ESP) parameter, which represents the CPU state of the current frame.

When the frame is managed, we get also a FuncId, which we can resolve to a symbol name via the CLR.

If the frame is unmanaged, we can use the context parameter to traverse the callstack until the next managed frame. Unfortunately, we don't know where the next managed frame starts until we get the next callback. So, the best thing to do is to postpone the stackwalk until we know where it ends. In my implementation, I store the callback information FuncId and part of the CONTEXT in a list. Afterwards, when DoStackSnapshot returns, I iterate over the list expanding all unmanaged frames until they either end or we reach the next managed IP.

In order for a native stackwalk to succeed properly, and to get readable names, it is a must to have the symbol files (.pdb) loaded. This is done with a call to SymInitialize. The problem with SymInitialize is that it is not so fast. If the profiling starts right from the from process startup, the PDB files are not yet available, and SymGetModuleBase64 and SymGetSymFromAddr64 fail. They should start working when the PDB file eventually gets loaded.

Wrapping Up

Requirements for my Profiler

  • Should work on the v2.0 and v4.0 runtime
  • Needs to start and stop profiling
  • No need for registration

Forcing Apps into the v4.0 Runtime

The ICorProfiler interface has many good features in v4.0, not available in v2.0. In order to avoid the work of supporting both v2.0 and v4.0, I chose to concentrate on getting the support of v4.0 working. V2.0 apps would be supported by forcing them to run in the v4.0 runtime. There are very few compatibility issues between v2.0 and v4.0, so it should work 95% of the time.

There is support for this already in the CLR, through an environment variable called COMPLUS_version.

C:\>SET COMPLUS_version=v4.0.30319
C:\>MyApp.exe

The version string must of course match the one that you have installed.

Setting up a Timer to take Stacktraces

Since the plan was to do a manual stackwalk, I need a timer to be triggered at regular intervals. At the point of writing, it is set to every 40 ms. That is 25 times per second. To get a more accurate profiling, every 5ms or 10ms is better.

Starting and Stopping the Profiler

Since the DLL gets loaded into the process space of the target, we need to find a way to communicate with it. One nice solution would be to register a COM object in the Running Object Table (ROT), and then get that object from the application where I wished to control the profiler. I think ROT requires using the Registry, so I chose an approach using Pipes for communication instead.

There is an advantage of being able to communicate with the profiler. We can start the app with profiling on hold. Doing this permits us to get all the early events that are happening in the CLR which we may miss if we attach at a later point. One callback that particularly interests us is ThreadAssignedToOSThread.

In the InitializeAttach routine of the Profiler callback interface, I spawned a new thread, which I dedicate as a command listener. The pipe communication uses may look strange, it is because I experimented a little with asynchronous IO.

Demo Applications

There are in total four sub projects in the solution:

  • DiagProfiler.dll
  • DiagProfilerClient.dll
  • DiagProfilerConsole.exe
  • DiagProfilerLauncher.exe

DiagProfiler.dll

This is the profiler. It needs to be configured from the environment variables. You have to do the configuration in a Cmd console before starting the target app. There is a file called profile.bat that you can update in order to reflect your installation. Run it in the Cmd window, then start your app from the same Cmd window.

REM Microsoft CORProfilerAPI
SET COR_ENABLE_PROFILING=1
SET COR_PROFILER={C6DBEE4B-017D-43AC-8689-3B107A6104EF}
SET COR_PROFILER_PATH=C:\Source\MixedModeProfiler\MixedModeProfiler_demo\DiagProfiler.dll

REM CLR Option
SET COMPLUS_Version=v4.0.30319

REM DiagProfiler
SET DIAG_PRF_SYMBOLPATH=SRV*c:\symbols*http://msdl.microsoft.com/download/symbols;
SET DIAG_PRF_DEBUGTRACE=C:\temp\diag_debugtrace.txt
SET DIAG_PRF_STACKTRACE=C:\temp\diag_stacktrace.txt
SET DIAG_PRF_ONLY_MANAGED_THREADS=1

COR_PROFILER_PATH must have an absolute path. Relative paths don't work. If DIAG_PRF_STACKTRACE is omitted, the trace will be saved in the My Documents folder. If DIAG_PRF_SYMBOLPATH is omitted, the current path of the target application will be used.

DiagProfilerClient.dll

This is a managed library that we can use to get further control over the profiler.

MC++
public ref class ProfileClrApp
{
  bool StartProcess(String^ exePath, String^ stackOutputPath, String^ debugOutputPath);
  bool AttachProfiler(DWORD pId);
  bool ConnectGUI();
  bool StartSampling();
  bool StopSampling();
  int GetSampleCount();
  int GetSampleRate();
  bool SetMaxSamples(int maxNoSamples);
  bool SetSampleRate(int rateMs);
  bool SetStackLoggerFile(String^ stackOutputPath);
};

Profiler Launcher and Controller

I built a small GUI client app to get better control over the profiler. It supports start/stop, attaching, querying the current sample count, etc.

Image 4

The stack output file will be saved to My Documents of the current user. Double click on the field to change the file location.

DiagProfilerConsole.exe

A small console app that is using the DiagProfilerClient.dll library. The application accepts a PId as a parameter. Running it without parameters makes it start the sample app CppCliApp.exe.

C:\...>DiagProfilerConsole.exe 6292
v2.0.50727
v4.0.30319
AttachProfiler(6292, "C:\...\DiagProfiler.dll") : S_OK
Pipe = \\.\pipe\PipeServer_6292
Pipe = \\.\pipe\PipeServer_6292
Pipe = \\.\pipe\PipeServer_6292
Connected
Number of bytes read: 6
Message = [1][1]
Sampling Rate: 25
Number of bytes read: 6
Message = [2][1]
SetMaxSamples: 100
Number of bytes read: 6
Message = [3][1]
StackLogFile=C:\...\diagstack_6292.txt
Number of bytes read: 8
Message = [4][1]25
Samplerate: 25 ms
Number of bytes read: 6
Message = [5][1]
Started Sampling
Will run #100 samples
Number of bytes read: 6
Message = [6][1]
Stopped Sampling
Number of bytes read: 9
Message = [7][1]100
Got #100 samples
Press any key to continue . . .
Message = [MessageCount=7][CommandSucceded=1]100
"100" is the result from the executed server command.

Two files will be created in the current directory.
diagstack_#pid.txt and diagdebug_#pid.txt

Debugging

It is a bit cumbersome to debug an in-process profiler. One way is using OutputDebugString, printfs, and inserting "__debugbreak()" in the code.

The log files can optionally also echo to OutputDebug. Files are a bit awkward to work with when you have crashes, since the files are not flushed and closed. If you decide to see how it works, and run into problems, enable the debug output.

MC++
g_diagInit.m_debugLogger.Echo2OutputDebug(true);
g_diagInit.m_debugLoggerVerbose.Echo2OutputDebug(true);

DiagInit is a class that should be used to initialize the objects. Some of my global variables could probably be moved into that class.

Usually when creating a COM DLL, its class factory has to be written, and a lot of glue code. I had a strange error which I couldn't get rid of, so I decided to write my own class factory, instead of using ATL COM. That file is called DiagProfilerFactory. I made some sanity checks and logging there, but I later moved the code to DiagInit which is statically created. That is, even before DiagProfilerFactory, which is created dynamically.

Things Left to Do

Optimizations

There are quite a few optimizations to do. At this point, the symbol information for every call-frame is resolved, and the stacktrace is written to disk. This procedure is repeated, even if the same info has been resolved before. Here is a lot of time to save, caching the symbol information and mappings to managed functions. I would suggest building a cache, and storing only the IPs at every sample interval. When the profiling has finished, you go back to the sample data and resolve the symbol information.

Don't Profile all Threads

At some point, the profiler took samples of all threads belonging to a process. In a normal app, and particularly in a managed app, there are a lot of background threads, e.g., the garbage collector and finalizer threads. Usually the performance of those aren't interesting. Those are usually in a wait state (e.g., WaitForMultipleObjects) and will always give you the same stacktrace. These threads can be removed too, either by not profiling them or removing them from the trace afterwards. In my implementation, profiling from the application startup profiles only the threads that have been assigned a managed thread ID. With respect to profiling an existing process, it profiles all threads, because I cannot obtain the managed thread IDs.

Fix AttachProfiler with Managed Thread IDs

I tried using ICorDebug and to enumerate all managed threads. That part worked, but the ID of the thread was the Win32 thread ID, not the managed thread ID. The way to go is probably to enable more CLR callbacks, and call GetCurrentThreadId() on them. The problem with that approach is that SetEvent can only be called once, from Initialize or InitializeForAttach. Meaning that I cannot turn the events off again.

Points of Interest

Two interesting resources are Profiler Stack Walking in .NET Framework 2.0: Basics and Beyond and Building a mixed mode stack walker.

The managed-only profiler mentioned earlier Creating a Custom Net Profiler actually has a continuation by the same author, called Slimtune. It is also a mixed-mode profiler. I found out about Slimtune at a very late stage. It was a bit discouraging at first, but I decided to continue on my implementation anyway, because I didn't want to miss the learning experience. It is a more advanced version than mine (but you have to figure out for yourself how it works). Their implementation combines DoStackSnapShot and ENTER/LEAVE/TAIL hooks in a smart way. They also seem to track memory allocations, and garbage collection. If you are interested in learning mixed-mode stackwalking, my implementation is probably easier to understand because it is smaller, has source code comments, and has this accompanying article. It is also a direct implementation of the Profiler Stack Walking in .NET Framework 2.0: Basics and Beyond guideline on MSDN.

Another good source of Managed debugging, ICorDebug, and random .NET stuff is Mike Stall's blog.

License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL)


Written By
Architect Visma Software AB
Sweden Sweden
Mattias works at Visma, a leading Nordic ERP solution provider. He has good knowledge in C++/.Net development, test tool development, and debugging. His great passion is memory dump analysis. He likes giving talks and courses.

Comments and Discussions

 
QuestionRe: Compilation issue in VS2012 Pin
vsselvaraj30-Jul-14 22:39
vsselvaraj30-Jul-14 22:39 
AnswerRe: Compilation issue in VS2012 Pin
Mattias Högström31-Jul-14 12:24
Mattias Högström31-Jul-14 12:24 
GeneralRe: Compilation issue in VS2012 Pin
vsselvaraj31-Jul-14 23:19
vsselvaraj31-Jul-14 23:19 
GeneralRe: Compilation issue in VS2012 Pin
Mattias Högström1-Aug-14 3:37
Mattias Högström1-Aug-14 3:37 
BugDoesn't work with WinRT application :( Pin
TomBoy20-Jan-13 3:30
TomBoy20-Jan-13 3:30 
GeneralRe: Doesn't work with WinRT application :( Pin
Mattias Högström9-Jan-14 8:24
Mattias Högström9-Jan-14 8:24 
QuestionDid not work for me:( Pin
eflatun076-Jan-13 22:46
eflatun076-Jan-13 22:46 
AnswerHow to debug the connection problem Pin
Mattias Högström9-Jan-14 8:09
Mattias Högström9-Jan-14 8:09 
Better late than never. I totally missed this question.
I just downloaded the demo zip and it ran fine on my computer.
A virtual machine with Windows 8 running. It is hard to say why it doesn't work for you.

What version of windows are you using?
What type of account, Administrator, Normal, restricted user.
Group policies? virus protection programs?

I cannot reproduce your error, but I have a guess.
The reason that it says GUI connection, or connection at all is how the debugging is performed. A DLL is loaded into the process space of the running process. This DLL handled the communication and listens for commands which we can send from the GUI. I would guess that the loaded DLL encounters an error and terminates/crashes the process, that would explain why the Pid is not present. To debug this. One has to debug both the DiagProfileLauncher and the process you want to profile.

When I debugged this. I used windbg, and set a breakpoint on the module load (the loading of DiagProfiler.dll). I don't think this is possible in Visual Studio. When the breakpoint was hit. I would set a new breakpoint in the initialisation routine where the connection is made and then single step from there. I know. It is a nightmare to debug this way, but that was the way I debugged it.
QuestionExcellent job Pin
Mike Hankey16-Jul-12 6:18
mveMike Hankey16-Jul-12 6:18 
AnswerRe: Excellent job Pin
Mattias Högström16-Jul-12 10:53
Mattias Högström16-Jul-12 10:53 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.