Tuesday, March 5, 2013

Hard Fault Debugging for Cortex-M MCUs

______________________________________________

1. Why is the HARD_FAULT exception raised?
    1.1. Uncontrolled/unexpected memory accesses.
    1.2. Accesses to unclocked peripherals.
    1.3. Executing Flash commands from code in Flash
2. Finding HARD_FAULT exception.
    2.1 General notes about the code
    2.2 Take special care with....
    2.3 The code
______________________________________________


This post explains how to debug HARD_FAULT exceptions on Cortex-M processors and how to locate the origin of the exception condition.

There are lots of sites describing this problem, and most of them show exactly the same code.

Of course, that code did not compile properly for my target processor (Kinetis / Cortex-M4), so I made some small changes to get it working.

Finally, I modified the code so that it compiles with the GNU GCC ARM Toolchain for many Cortex-M targets; it should therefore compile with any GCC-compatible toolchain (or at least it is supposed to).

Before going into the code, let's say a few words about the causes of HARD_FAULT exceptions.


1. Why is the HARD_FAULT exception raised?

The most common causes of HARD_FAULT exceptions are the following:

1.1. Uncontrolled/unexpected memory accesses.

This happens due to programming errors that result in uncontrolled accesses to restricted or forbidden memory locations.

In the best case, the HARD_FAULT exception is caused by simple mistakes that the compiler does not catch: mistyping one variable name for a similar one (e.g. exchanging 'u8aux' and 'u8idx' when both variables are declared in the same scope), array indexes going out of bounds, and the like.

In the worst case, the exception is raised by problems that are harder to debug, such as a wrong return from a function or ISR due to stack corruption.

1.2. Accesses to unclocked peripherals.

Some MCUs gate the clock to each system module or peripheral individually, so that only the modules actually in use consume power.

On these MCUs you must enable the clock to a peripheral module before accessing any of its registers; otherwise a HARD_FAULT exception will almost certainly be raised.
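
As a rough illustration of the "enable the clock first" rule, the sketch below models a clock-gate register with one enable bit per peripheral and refuses to touch a peripheral register while its clock is off. The register layout, the bit number and all function names are hypothetical; on a real part you would use the clock-gating registers and bit masks from the vendor's device header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical clock-gate register: one enable bit per peripheral. */
typedef volatile uint32_t clock_gate_reg_t;

/* Turn on the clock of the peripheral controlled by the given bit. */
static inline void clock_gate_enable(clock_gate_reg_t *gate, unsigned bit)
{
    *gate |= (1u << bit);
}

/* Report whether the peripheral's clock is currently enabled. */
static inline bool clock_gate_is_enabled(const clock_gate_reg_t *gate, unsigned bit)
{
    return (*gate & (1u << bit)) != 0u;
}

/* Write a peripheral register only if its clock is running.
 * Returns false and leaves the register untouched otherwise,
 * instead of performing an access that would fault on real hardware. */
static inline bool peripheral_write(const clock_gate_reg_t *gate, unsigned bit,
                                    volatile uint32_t *reg, uint32_t value)
{
    if (!clock_gate_is_enabled(gate, bit)) {
        return false;
    }
    *reg = value;
    return true;
}
```

A guard like this is mostly useful in debug builds; in production code it is usually enough to call the clock-enable step once during peripheral initialization.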

1.3. Executing Flash commands from code in Flash

When the MCU does not support the "read while write" flash feature, a HARD_FAULT exception may be raised when flash commands (such as erasing or writing a sector) are executed from code stored in flash, even if that code is apparently not affected by the flash operation.

The easiest way to solve this problem is to run the flash command code from RAM instead of flash. To do this, set up your linker configuration file to load the flash driver code into RAM on startup. (Visit this post to get an example of this issue).
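
With GCC, one common way to do this is to tag the flash routine with a section attribute and let the linker script place that section in RAM. The sketch below assumes a section called ".ramfunc" (a hypothetical name: your linker file must define it, locate it in RAM, and the startup code must copy it there from flash). The status register behaviour (write the "done" flag to launch a command, then poll until it reads back as set) is also an assumption; check the flash controller chapter of your MCU. When not building for ARM, the attribute is dropped so the function is just a normal one.

```c
#include <stdint.h>

/* Place the function in a RAM-resident section on ARM/GCC builds.
 * ".ramfunc" is an assumed section name that the linker script must provide. */
#if defined(__GNUC__) && defined(__arm__)
#define RAM_FUNC __attribute__((section(".ramfunc"), noinline))
#else
#define RAM_FUNC
#endif

/* Hypothetical "command complete" flag in the flash status register. */
#define FLASH_STATUS_DONE 0x80u

/* Launch a previously prepared flash command and wait for it to finish.
 * Everything executed while the command runs must live in RAM, so this
 * function must not call helpers that are located in flash. */
RAM_FUNC void flash_run_command(volatile uint8_t *status_reg)
{
    *status_reg = FLASH_STATUS_DONE;                  /* launch (assumed semantics) */
    while ((*status_reg & FLASH_STATUS_DONE) == 0u) {
        /* busy-wait until the controller reports completion */
    }
}
```

Interrupts whose handlers live in flash should normally be disabled around the call, since an ISR fetched from flash during the operation would hit the same problem.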


2. Finding HARD_FAULT exception.

The main goal of the code shown below is to recover the processor execution context that was present when the exception was raised (program counter, stack pointers, register contents, etc.).

To understand the code below, keep in mind that Cortex-M MCUs process the HARD_FAULT exception like a high-priority interrupt: they store the processor execution context on the stack before jumping to the HARD_FAULT handler vector. This way, we can extract from the stack the execution context that raised the exception.

The code below declares a HARD_FAULT exception handler, "hardfaultHandler", that determines which stack (PSP or MSP) was active when the exception happened and calls another function, "hardfaultGetContext", that extracts the stacked context into local variables.
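
To make the two steps concrete, here is a small, target-independent sketch of the logic involved. It is not the handler from this post: the type and function names are hypothetical, and it only covers the basic 8-word frame (R0-R3, R12, LR, PC, xPSR) that Cortex-M pushes on exception entry, ignoring the extra floating point state. Bit 2 of the EXC_RETURN value found in LR inside the handler tells which stack holds that frame.

```c
#include <stdint.h>

/* Registers pushed automatically on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3;
    uint32_t r12;
    uint32_t lr;   /* link register of the interrupted code */
    uint32_t pc;   /* address of the instruction that was executing */
    uint32_t psr;  /* program status register */
} stacked_frame_t;

/* EXC_RETURN bit 2: 0 = frame is on the main stack (MSP), 1 = on the process stack (PSP). */
static inline uint32_t select_stack_pointer(uint32_t exc_return,
                                            uint32_t msp, uint32_t psp)
{
    return (exc_return & 0x4u) ? psp : msp;
}

/* Copy the stacked words into named fields so a debugger can show them. */
static inline stacked_frame_t decode_stacked_frame(const uint32_t *sp)
{
    stacked_frame_t f;
    f.r0  = sp[0];
    f.r1  = sp[1];
    f.r2  = sp[2];
    f.r3  = sp[3];
    f.r12 = sp[4];
    f.lr  = sp[5];
    f.pc  = sp[6];
    f.psr = sp[7];
    return f;
}
```

The stacked PC is usually the most useful field: looking it up in the map file or disassembly points to the instruction that was executing when the fault was taken.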

2.1 General notes about the code

The code shown below has been compiled using GNU GCC ARM Toolchain running from Eclipse IDE.
It successfully compiles without errors targeting the following Cortex-M processors:
  • Cortex-M0
  • Cortex-M0+
  • Cortex-M1
  • Cortex-M3
  • Cortex-M4 (without floating point processor)
  • Cortex-M4 (with software floating point)
  • Cortex-M4 (with hardware floating point processor)

2.2 Take special care with....

There is one special issue addressed in the code below that I have not found on other blogs or websites, and that took me a couple of painful days to work through.

Most hardfault code published on the Internet calls the hardfaultGetContext function from inline assembler (using the BL branch-to-label instruction). The following code shows an example of wrong code found on the Internet...



By default, the code above produces an assembler error because the assembler knows neither the symbol "hardfaultGetContext" nor "_hardfaultGetContext".

To avoid the error, you must assign an assembler label to the entry point of the "hardfaultGetContext" function and branch to that label from inline assembly. In the following code snippet you can see how to assign an assembler label to a C function...
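
One way to do this with GCC is an asm label on the function declaration, which fixes the symbol name the compiler emits. The sketch below is an assumption about how this can be done, not necessarily the exact snippet used here. The variable hardfault_stacked_pc is hypothetical, and the assembly part is only built for ARMv7-M targets (Cortex-M3/M4); Cortex-M0/M0+ need a different instruction sequence because they lack the IT block and the TST immediate form.

```c
#include <stdint.h>

/* GCC asm label: the function is emitted under exactly this symbol name,
 * so inline assembly can branch to it. */
void hardfaultGetContext(uint32_t *stack) __asm__("hardfaultGetContext");

/* Hypothetical variable: last stacked PC, volatile so a debugger can read it. */
volatile uint32_t hardfault_stacked_pc;

void hardfaultGetContext(uint32_t *stack)
{
    hardfault_stacked_pc = stack[6];   /* word 6 of the stacked frame is the PC */
    /* On a real target you would normally halt here (breakpoint or endless loop). */
}

#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
/* Naked handler: pick MSP or PSP from bit 2 of EXC_RETURN (held in LR)
 * and pass it to the C function as its first argument in R0. */
__attribute__((naked)) void hardfaultHandler(void)
{
    __asm__ volatile(
        "tst   lr, #4              \n"
        "ite   eq                  \n"
        "mrseq r0, msp             \n"
        "mrsne r0, psp             \n"
        "b     hardfaultGetContext \n"
    );
}
#endif
```

The handler still has to be installed in the vector table under whatever name your startup file expects for the hard fault vector.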



2.3 The code

And finally, here is the complete code. Enjoy it!




1 comment:

  1. Thank you ! Very valuable - I just had an issue with hard fault exception, and no clue what was happening. It resolved by itself (wtf?!? - via cold reset), but if I have it again - your code will help me.
