Safe Implementation of Custom Fault Handler with System Shutdown - Teensy 4.1

ibrahim-mohsen

New member
I'm implementing a custom fault handler for a robotics project that needs to safely shut down motors before executing the default fault handler. Here's my implementation:

1. I'm overriding fault vectors (3-6) in _VectorsRam for:
- Hard Fault
- Memory Management Fault
- Bus Fault
- Usage Fault
- the system_shutdown() function may be called from a custom assert function as well


2. My custom fault handler:
- Disables interrupts
- Executes safe shutdown (stops motors)
- Calls original unused_interrupt_vector()

Key questions:
1. Is this a safe approach for fault handling?
2. Will USB programming capability be maintained?
3. Are there any risks in calling system shutdown function from a fault handler?


#include <RoboClaw.h>
#include "kill_switch.h"
#include <Arduino.h>
#include <interrupt_config.h>
#include <interrupt_pins.h>
#include "led.h"
#include <sumo_assert.h>
#include <imxrt.h>
extern void yeild(void);
extern void unused_interrupt_vector(void);
extern void (*volatile _VectorsRam[NVIC_NUM_INTERRUPTS + 16])(void);
RoboClaw *globalRoboClaw = nullptr;
LED *globalLED = nullptr;
void kill_switch_init(RoboClaw *roboClaw, LED &led)
{
_VectorsRam[3] = &FaultISR; // Hard Fault
_VectorsRam[4] = &FaultISR; // Memory Management Fault
_VectorsRam[5] = &FaultISR; // Bus Fault
_VectorsRam[6] = &FaultISR; // Usage Fault
pinMode(KILL_SWITCH_PIN,INPUT);
attachInterrupt(digitalPinToInterrupt(KILL_SWITCH_PIN), kill_switch_isr, FALLING);
sumo_assert(roboClaw != nullptr, "RoboClaw pointer is null in kill_switch_init");
sumo_assert(&led != nullptr, "LED reference is null in kill_switch_init");
globalRoboClaw = roboClaw;
globalLED = &led;
}
extern "C"
{ // ensure the ISR has C linkage as it will be called from C code
void kill_switch_isr()
{
// Code to safely stop the robot
system_shutdown(NO_ERROR, "Remote Stop Triggered");
}
}
void system_shutdown(enum error_code code, const char *message, bool halt = true)
{
// Code to safely shut down the system
globalRoboClaw->ForwardM1(0, 0); // Stop motor 1
globalRoboClaw->ForwardM2(0, 0); // Stop motor 2
switch (code)
{
case EMERGENCY_STOP:
globalLED->setColor(Status, red);
globalLED->setColor(Strategy, off);
break;
case SYSTEM_FAILURE:
globalLED->setColor(Status, blue);
globalLED->setColor(Strategy, off);
break;
case NO_ERROR:
globalLED->setColor(Status, off);
globalLED->setColor(Strategy, off);
break;
default:
break;
}
while (halt)
{
delay(1000);
if (message != nullptr)
Serial.println(message);
yeild();
}
}
extern "C"
{
void FaultISR(void)
{
// Disable interrupts first
__disable_irq();
// Call system shutdown
system_shutdown(SYSTEM_FAILURE, "System Fault Detected", false);
// Then call the default handler for proper system handling and progrmming
unused_interrupt_vector();
// Should never reach here
while (1)
{
}
}
}
 
If it's a safety issue I wouldn't trust the CPU to be able to reliably do anything after one of those faults occurred. I would use the external watchdog monitor to raise a signal that could be handled by an external shutdown circuit.

(Note that unused_interrupt_vector() will eventually automatically restart the Teensy, which means it can potentially re-trigger the same fault infinitely.)
 
Software would not cover the case where the CPU just stop, due to hardware reasons (clocks, power supply, .....)
 
2. My custom fault handler:
- Disables interrupts

Key questions:
2. Will USB programming capability be maintained?
WIthout interrupts it seems USB would be disabled.

RE key questions #1 and #3:: PJRC implemented CrashReport code that might be a guide to replace? It captures faults where possible and safely makes a record and provides code space for essential tasks - then waits 8 seconds and restarts,

Even so as noted the MCU restarts - perhaps look into setting the MCU into OFF mode after assuring an external watchdog and shutdown circuit is in place as needed - or assure that restarting the MCU is the right thing to control the situation. Or quickly log the fault and restart ASAP than could be some tenths of seconds to regain a fresh start. In startup.c there are startup_default_??_hooks for undelayed attention as the MCU restarts and comes back to function - NOTE: some are online before certain code can execute.
 
Back
Top