Post-mortem debugging

Post-mortem debugging is in some ways a very special kind of debugging: the process or system has already crashed, and you're left with a dump file of sorts to figure out what went wrong.

I won't get into particulars too much here, but just give a quick introduction.

Wikipedia discusses post-mortem debugging on its Core dump page. Kernel dumps have their own twists, so I'll start with user-mode dumps, which I'll refer to as 'minidumps'.

General flow of post-mortem debugging

The first thing you want to do when you get a minidump is make sure that you have good symbols. At the very least, you'll want symbols for your operating system. For Microsoft Windows, you can set up Visual Studio to load them up by checking the Microsoft Symbol Servers location in the proper Options page. If WinDBG is your jam, that's also well supported.

Without operating system symbols, a bunch of things will be suspect. The debugger may not know how to properly understand process and thread-level information, and while it may guess, there's really no reason to risk getting any of that wrong.
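To make the symbol setup a bit more concrete, here's a minimal Python sketch that builds a symbol path pointing at the public Microsoft symbol server with a local download cache. The function name and the C:\symbols cache directory are made up for this example. The resulting string is the kind of value you could put in the _NT_SYMBOL_PATH environment variable or hand to WinDBG's .sympath command.

```python
# Public Microsoft symbol server URL.
MS_SYMBOL_SERVER = "https://msdl.microsoft.com/download/symbols"


def build_symbol_path(cache_dir, server_url=MS_SYMBOL_SERVER):
    """Return a 'srv*<cache>*<server>' symbol path string.

    cache_dir is a local folder where downloaded symbols are kept
    (hypothetical here; pick whatever suits your machine).
    """
    return f"srv*{cache_dir}*{server_url}"


if __name__ == "__main__":
    # Example cache location, purely illustrative.
    print(build_symbol_path(r"C:\symbols"))
```

Keeping a local cache in the path means the debugger only has to download each symbol file once.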

After getting symbols, you'll want to make sure you can see what triggered the minidump. "It crashed" is more a description of the outcome than the cause. Reasons for "crashing" and triggering a minidump might include an access violation, an uncaught C++ exception, a triggered assertion, an explicit call, or even something as simple as taking too long to become responsive after app launch.
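Many of these triggers show up in the dump as a Windows exception code, so a small lookup can help turn "it crashed" into something more specific. The sketch below is only illustrative: the table covers a handful of well-known codes and is nowhere near exhaustive, and the function name is made up for this example.

```python
# A few well-known Windows exception codes (not an exhaustive list).
KNOWN_EXCEPTION_CODES = {
    0xC0000005: "access violation",
    0xC00000FD: "stack overflow",
    0xC0000094: "integer divide by zero",
    0x80000003: "breakpoint",
    0xE06D7363: "Microsoft C++ exception",
}


def describe_exception_code(code):
    """Map an exception code to a short description of the trigger.

    Codes are sometimes reported as signed 32-bit values, so mask
    to an unsigned 32-bit value before looking them up.
    """
    unsigned_code = code & 0xFFFFFFFF
    name = KNOWN_EXCEPTION_CODES.get(unsigned_code)
    if name is None:
        return f"unknown exception code 0x{unsigned_code:08X}"
    return name


if __name__ == "__main__":
    print(describe_exception_code(0xC0000005))
    print(describe_exception_code(0xE06D7363))
```

Triggers such as an explicit dump request or a launch timeout won't necessarily map onto an exception code like this, which is one more reason to look at what the debugger reports rather than guess.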

In WinDBG, a couple of useful commands are !analyze and .ecxr. The former might take a bit of time to run but will produce a report describing some basics of what went wrong: the trigger, the faulting instruction if available, the call stack, and so on. The latter sets the debugger's register context to the context record stored with the exception, allowing you to get the call stack and registers at the point of failure.
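If you'd rather run those commands unattended, the console debugger cdb, which ships alongside WinDBG, can open a dump and execute a command string. Here's a minimal Python sketch that only builds the argument list. It assumes cdb is on your PATH, and the function name, the dump file name, and the exact command string are choices made for this example.

```python
def build_cdb_command(dump_path, symbol_path=None, debugger="cdb"):
    """Build an argument list that opens a dump and runs a few commands.

    -z names the dump file, -y sets the symbol path, and -c gives the
    debugger commands to run at startup.
    """
    # Analyze, switch to the exception context, then show the
    # call stack (k) and registers (r), and finally quit (q).
    commands = "!analyze -v; .ecxr; k; r; q"
    args = [debugger, "-z", dump_path]
    if symbol_path:
        args += ["-y", symbol_path]
    args += ["-c", commands]
    return args


if __name__ == "__main__":
    # crash.dmp is a placeholder file name.
    print(build_cdb_command("crash.dmp"))
```

On a machine with the Debugging Tools for Windows installed, the returned list could be passed to subprocess.run to capture the report as text.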

In the future I hope to write about some of these topics in more detail.

Happy debugging!

Tags: debugging, design
