Debugging Software Crashes IIThis article continues our discussion on debugging software crashes. Here we focus on memory corruption crash symptoms. We will also look at the special considerations in debugging C++ code crashes. Finally we will look at techniques to simplify crash debugging. Debugging Memory CorruptionPrograms store data in any of the following ways:
Memory corruption in the global area, stack or the heap can have confusing symptoms. These symptoms are explored here. Global Memory CorruptionIf a global data location is found to be corrupted, there is good chance that this is caused by array index overflow from the previous global data declarations. Also the corruption might have been caused by an array index underflow (array accessed with a negative index) from the next variable declarations. The following rules should be helpful in debugging this condition:
Heap Memory CorruptionCorruption on the heap can be very hard to detect. A heap corruption could lead to a crash in heap management primitives that are invoked by memory management functions like malloc and free. It might be very hard to detect the original source of corruption as the buffer that lead to corruption of adjacent buffers might have long been freed. Guidelines for debugging crashes in heap area are:
Stack Memory CorruptionStack corruption by far produces the most varied symptoms. Modern programming languages use the stack for a large number of operations like maintaining local variables, function parameter passing, function return address management. See the article on c to assembly translation for details.Here are the rules for debugging stack corruption:
Crash Debugging in C++We have been discussing crash debugging techniques that apply equally well to C as well as C++. This section covers crash debugging techniques that are specific to C++. Invalid Object PointerMany C++ developers get confused by crashes that involve method invocation on a corrupted pointer. Developers need to realize that invoking a method for an illegal object pointer is equivalent to passing an illegal pointer to a function. A crash would result when any member variable is accessed in the called method. In the example given below, when HandleMsg() is invoked for a NULL pX, the crash will result only when an access is attempted to member variables of X. There will be no problem in calling PrepareForMessage() or HandleYMsg() for Y pointer. (For more details on this refer to C and C++ article.
V-Table Pointer Corruption
All classes with virtual functions have a pointer to the V-table corresponding to overrides for that class. The V-table pointer is generally stored just after the elements of the base class. Corruption of the v-table pointer can baffle developers as the real problem often gets hidden by the symptoms of the crash. The figure above shows the declaration of class A and B. The figure below shows the memory layout for an object of class B. If m_array array is indexed with an index exceeding its size, the first variable to be corrupted will be the v-table pointer. This problem will manifest as a crash on invoking method SendCommand. The reason this happens is that SendCommand is a virtual function, so the real access will be using a virtual table. If the virtual table pointer is corrupted, calling this function will take you to never-never land.
For more details on v-table organization refer to C and C++ Comparison II article. Dynamic Memory AllocationMany C++ programs involve a lot of dynamic memory allocation by new. Many C++ crashes can be attributed to not checking for memory allocation failure. In C++ this can be achieved in two ways:
Simplifying Crash DebuggingHere are a few simple techniques for simplifying crash debugging: Obtaining Stack DumpMake sure that every embedded processor in the system supports dumping of the stack at the time of crash. The crash dump should be saved in non volatile memory so that it can be retrieved by tools on processor reboot. In fact attempt should be made to save as much as possible of processor state and key data structures at the time of crash. Using assertAn ounce of prevention is better than a pound of cure. Detecting crash causing conditions by using assert macro can be a very useful tool in detecting problems much before they lead to a crash. Basically assert macros check for a condition which the function assumes to be true. For example, the code below shows an assertion which checks that the message to be processed is non NULL. During initial debugging of the system this assert condition might help you detect the condition, before it leads to a crash.Note that asserts do not have any overhead in the shipped system as in the release builds asserts are defined to a NULL macro, effectively removing all the assert conditions.
Defensive Checks and TracingSimilar to asserts, use of defensive checks can many times save the system from a crash even when an invalid condition is detected. The main difference here is that unlike asserts, defensive checks remain in the shipped system. Tracing and maintaining event history can also be very useful in debugging crashes in the early phase of development. However tracing of limited use in debugging systems when the system has been shipped. |
|