SD, SPI and Interrupts

Status
Not open for further replies.

microderm

Well-known member
I'm using Teensy with various devices including an accelerometer (SPI1), an OLED (SPI0), GPS (USART) and the SD card for data logging. The accelerometer is read using an IntervalTimer interrupt to get frequent, regular samples. It all works nicely when not logging to SD. When logging is enabled, it sometimes crashes.

The IntervalTimer interrupt simply reads the accelerometer values via SPI1 and stores them in an array.

I also have an interrupt on pin 2 from the GPS PPS pulse to synchronise timing between GPS and millis().

I am not currently using the SPI.usingInterrupt(interrupt), as the interrupt is an IntervalTimer, so I'm not sure what interrupt number that corresponds to.

I am using volatile variables and noInterrupts() etc. according to good practices.

How should I go about making this system robust when recording to SD?
 
What does your interrupt function actually do? Call other functions?

How should I go about making this system robust when recording to SD?

Best to avoid interrupts if you can. They've very tricky to use properly. I know you say you're "using volatile variables and noInterrupts() etc. according to good practices", but when we hear about problems where a program mysteriously crashes, it's almost always unsafe stuff from interrupts (using nearly any library from interrupts is unsafe) or a buffer overflow.
 
What does your interrupt function actually do? Call other functions?

Yes, one of the interrupts calls a function in the main .ino file.

The GPS module attempts to allow Teensy to get accurate timing by finding the relationship between the system mills() and GPS UTC time using the PPS second pulse. When a pulse is received the UTC timestamp is calculated and compared to millis() to find the offset (this value slowly drifts with time). When the value changes, a request is sent to the accelerometer to adjust its timing.

1) Time synch interrupt:
Code:
/******************** GPSModel header ********************/

void (*callback)(int); // Function: synchEvent() of main .ino file //

volatile int time_ms_offset; // Set within interrupt //
volatile int GPRMC_time; // Set outside interrupt //
volatile bool synchronised; // Set within interrupt //

GPSModel* GPSModel::self = NULL; // Self reference to instance //

/******************** GPSModel cpp ********************/

/*
 * Initialises the pps interrupt.
 */
void GPSModel::initPPS() {
    // Set up synchronisation interrupt //
    synchronised = false; // Interrupt not yet started - safe! //
    pinMode(pps_pin, INPUT); // GPS time pulse //
    attachInterrupt(digitalPinToInterrupt(pps_pin), synchTime, RISING); // Using default priority - don't know how to set to higher priority! //
    active_pps = true;
}

/*
 * Static time synchronisation interrupt function. Uses the latest timestamp to calculate the real time to system time offset.
 */
void GPSModel::synchTime() {
    int time = ((self->GPRMC_time + 500) / 1000) * 1000; // Round to nearest second //
    int offset = time - millis();
    int adjust = self->time_ms_offset - offset;
    self->time_ms_offset = offset;
    if (adjust) {
        if (!self->synchronised) {
            adjust = 0; // First time: zero value //
            self->synchronised = true;
        }
        self->callback(adjust); // Request a shift in accelerometer timing //
    }
    self->pulse_pps = true;
    //Serial.println("\n*** pulse *** ms: " + String(time) + " (" + String(self->time_ms_offset) + ")\n");
}

/*
 * Returns the current synchronisation offset from system time to UTC time.
 * @return the synchronisation offset in milliseconds.
 */
int GPSModel::getSynchOffset() {
    noInterrupts();
    int offset = time_ms_offset; // Copy data: don't allow modification while copying //
    interrupts();
    return offset;
}

/******************** Main .ino file ********************/

volatile bool volatile_sync = false; // PPS time synchronisation is complete //
volatile int adjustment = 0; // Millisecond shift in GPS to clock timing //

/*
 * Volatile callback function for time synchronisation success.
 * @param ms the millisecond adjustment required to accelerometer timing.
 */
void synchEvent(int ms) {
    Serial.println("*** Synch Event! ***: " + String(ms));
    if (!ms) {
        volatile_sync = true;
    } else {
        adjustment = ms;
    }
}

/*
 * Respond to volatile data changes outside the interrupt.
 */
void pollCheck() {
    int adjust = 0;
    
    noInterrupts();
    adjust = adjustment; // Un-interrupted copy //
    adjustment = 0; // Clear volatile data //
    interrupts();
    
    if (adjust) {
        accel.adjust(adjust); // Adjust accelerometer timing //
    }
    temp_adjust = adjust;
}

I don't suspect this code of any wrong doing as this always runs while not recording to SD. Never seen a crash while not recording to SD. I have seen various warnings in threads about calling functions and using Serial.print though, oops!
 
What does your interrupt function actually do? Call other functions?

The accelerometer uses an IntervalTimer to get regular readings. It does not call any other functions, but I am not using the SPI.usingInterrupt(interrupt) code, as I don't know how to get the interrupt number from the timer.

2) Accelerometer timer interrupt:
Code:
/******************** Accelerometer header ********************/

IntervalTimer timer;

volatile bool changed;
volatile int adjustment;
volatile unsigned int index;
volatile unsigned int time_array[MAX_REC_COUNT];
volatile float data_array[AXIS_COUNT][MAX_REC_COUNT];

/******************** Accelerometer cpp ********************/

Accelerometer* Accelerometer::self = NULL; // Self reference to instance //

/*
 * Starts the sensor timer.
 * @param timestamp the UTC timestamp reference.
 */
void Accelerometer::start() {
    if (!active) {
        noInterrupts();
        
        adjustment = 0;
        changed = false;
        callback();
        timer.begin(callback, 20000); // Start callbacks //
        
        interrupts();
    }
    active = true;
}

/*
 * Copies accumulated data and returns the number of new data items.
 * @return the number of new data items.
 */
int Accelerometer::update(float* data, unsigned int* times) {
    int counter;
    noInterrupts();
    // Copy data //
    memcpy((float*)data, (float*)data_array, DATA_BYTES);
    memcpy((unsigned int*)times, (unsigned int*)time_array, TIME_BYTES);
    counter = index; // Un-interrupted copy //
    index = 0; // Reset the index //
    interrupts();
    return counter;
}

/*
 * Adjusts the sensor timing by the given number of milliseconds.
 * @param ms the number of milliseconds to adjust the timer.
 */
void Accelerometer::adjust(int ms) {
    noInterrupts();
    adjustment = ms; // Un-interrupted copy //
    interrupts();
}

/*
 * The static timer callback function. [∆t: 16 µs]
 */
void Accelerometer::callback() {
    // Only read data when storage available //
    if (self->index < MAX_REC_COUNT) {
        (self->sensor)->getAccelData(self->acc_array); // Using SPI1 to talk to the sensor //
        self->data_array[AXIS_X][self->index] = self->acc_array[AXIS_X];
        self->data_array[AXIS_Y][self->index] = self->acc_array[AXIS_Y];
        self->data_array[AXIS_Z][self->index] = self->acc_array[AXIS_Z];
        self->time_array[self->index] = millis();
        self->index++;
    } else {
        //Serial.println("*** Accelerometer Data Overflow! ***");
    }
    if (self->adjustment) {
        // Make timing adjustment //
        self->timer.update((20000 + self->adjustment) * 1000); // Period in microseconds //
        self->changed = true;
    } else if (self->changed) {
        // Restore regular timing //
        self->timer.update(20000 * 1000); // Period in microseconds //
        self->changed = false;
    }
}

/******************** Sensor cpp ********************/

/*
 * Reads the multi-axis acceleration.
 * @param output the array to populate.
 */
void Sensor::getAccelData(float output[3]) {
    byte data[] = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    readRegisters(AXIS_X, 9, data);
    // Set 20-bit data //
    output[0] = normalise((data[0] << 12) | (data[1] << 4) | (data[2] >> 4), 20) * SCALAR_X;
    output[1] = normalise((data[3] << 12) | (data[4] << 4) | (data[5] >> 4), 20) * SCALAR_Y;
    output[2] = normalise((data[6] << 12) | (data[7] << 4) | (data[8] >> 4), 20) * SCALAR_Z;
}

/*
 * Reads multiple registers into the given array starting at the given address for the given count.
 * @param address the register start address.
 * @param count the number of registers to read.
 * @param data the array to populate.
 */
void Sensor::readRegisters(int address, int count, byte* data) {
    byte addr;
    spi->beginTransaction(settings);
    digitalWriteFast(CS_PIN, LOW);
    addr = (address << 1) | READ_BYTE;
    spi->transfer(addr);
    for (int i = 0; i < count; i++) {
        data[i] = spi->transfer(0x00);
        address++;
    }
    digitalWriteFast(CS_PIN, HIGH);
    spi->endTransaction();
}
 
How do you SD logging?

An SD file is opened and remains open until recording is stopped. All data is carefully parsed into char arrays of fixed size and inserted into a large char array. The char array is written to the SD card. The data lines are always exactly the same length.

The accelerometer data is stored sequentially in an array. Checks prevent storage beyond the array's size. Excess data will be discarded. The main app copies the data into its own array and processes it. The copy process is protected by noInterrupt().

I see write times of about 40us, but sometimes (once every few minutes) they rise to 7000 or even 13000us (13ms) for one line only and then back to normal duration again. I presume that this is normal behaviour?

Code:
/******************** Main .ino ********************/

#include <SdFat.h>

SdFatSdioEX sdfat;
FatFile* fat; // The directory file //
File f; // The working storage or playback file //

const int MAX_LINE_SIZE = 340; // Storing lots of data! //
char string_buff[MAX_LINE_SIZE]; // SD data string //

/*
 * Opens the current file of the SD card for writing.
 * @param filename the name of the file to open.
 * @return the success of the operation.
 */
bool openFile(const char* filename) {
    if (!initialised) {
        Serial.println("[NOT initialised!]");
    } else if (!f.open(filename, O_RDWR | O_CREAT)) {
        sdfat.errorHalt("[SD open failed!]");
    } else {
        return true; // Success //
    }
    return false;
}

/*
 * Composes the data line to write.
 */
void writeFileData() {
    accel_count = accel.update(&data_array[0][0], &time_array[0]); // Receives the batch sensor data //

    int iter = 0; // Index of next free buffer slot //
    /*
     Insert tab separated data items into string buffer, e.g.
     */
    Utilities::insertPadFloat(string_buff, MAX_LINE_SIZE, iter, v.i, 3, 3, ' '); // Accel x-coordinate format: nnn.ddd //
    string_buff[iter++] = '\t'; // Add tab //
    Utilities::insertPadFloat(string_buff, MAX_LINE_SIZE, iter, v.j, 3, 3, ' '); // Accel y-coordinate format: nnn.ddd //
    string_buff[iter++] = '\t'; // Add tab //
    
    iter--; // Remove final tab //
    string_buff[iter] = '\0'; // Insert end of string //
    writeCharLine(string_buff, iter); // Write buffer to file //
}

/*
 * Appends the given line of text to the open SD card file (appends <CR><NL>).
 * @param line the line of text to write.
 * @param count the number of charcters to write.
 * @return the success of the write operation.
 */
bool writeCharLine(char* line, int count) {
    printed = f.println(line);
    char_count = count + NL_LEN; // Account for the line ending //
    if (printed == char_count) {
        return true;
    } else {
        Serial.println(F("[Print Data Error!]"));
        Serial.println("char_count: " + String(char_count));
        Serial.println("printed: " + String(printed));
    }
    return false;
}
 
I hope the code snippets I've posted succinctly describe what I'm trying to do and how I'm going about it?

From here, I would like to understand in what ways interrupts will cause problems. For example, I read that calling functions from the ISR is a bad idea, but why? Is it just that it is slow to do so and ISRs should be as fast as possible, or is there a more sinister complication associated with stack usage that could, in itself result in a crash?

The GPS pin pulse interrupt (1) above calls a function in the main .ino script. Does this count as calling a library? If this is dangerous to do, can I fix it by having the main script poll the the GPS instance and ask whether the time offset has changed instead?
 
Not knowing what insertPadFloat does with the MAX_LINE_SIZE, but assuming it checks for overflow, maybe your calls to

Code:
Utilities::insertPadFloat(string_buff, MAX_LINE_SIZE, iter, v.j, 3, 3, ' ');

should be 

Utilities::insertPadFloat(string_buff, MAX_LINE_SIZE - 1, iter, v.j, 3, 3, ' ');

instead as you add a tab on the end, remove it and add a zero without checking for buffer overflow.
 
I hope the code snippets I've posted succinctly describe what I'm trying to do and how I'm going about it?
What I really mean is where do you write to SD (from the main loop, from a routine that is called from an interrupt)?

You can call any routine from an interrupt, as long as you make sure that
- you do not spend too much time in the function.
- or interrupt levels are organized such that every interrupt is served and every function gets its share of time to be executed.

So, if you only access the SD card from main loop(), which has lowest priority, then SD card cannot block regular ISRs.

SD cards can block the function up to over 100 ms.
Also IMO you have far to inhibitions of interrupts. Ge rid of all of them. I never use them in my programs, and logging high speed acquisition (>2MB/s) is not an issue
 
I see write times of about 40us, but sometimes (once every few minutes) they rise to 7000 or even 13000us (13ms) for one line only and then back to normal duration again. I presume that this is normal behaviour?

Yes. Writes to SD cards do that. Made worse by using a FAT file system. Allocating a new cluster triggers a scan of the FAT (1 or more reads) and update. Which takes time. Dependent on the state of the file system.
 
You're using some library named "Accelerometer" which accesses hardware in some way we can't see, *and* does callbacks, and some C++ class called "Utilities", which does who knows what.... and you're throwing interrupts into the mix to cause everything to happen.

Have you carefully analyzed and vetted every part of Accelerometer and Utilities for interrupt safety? I also see usage of Arduino's String class in one of those code fragments. String is *definitely* not interrupt safe!

Seems pretty clear you've already invested a lot of time & effort with interrupts, so unlikely you'll want to abandon that path. That sort of thinking just makes it all the more painful in the end. This sort of approach almost never works out well. You really should avoiding using interrupts so much to trigger main program logic. I try to warn of this, even in red text, on all the PJRC pages where we document interrupt features. Interrupts are very powerful, but very hard to use and best to keep them limited to only fairly simple hardware access.
 
From here, I would like to understand in what ways interrupts will cause problems. For example, I read that calling functions from the ISR is a bad idea, but why? Is it just that it is slow to do so and ISRs should be as fast as possible, or is there a more sinister complication associated with stack usage that could, in itself result in a crash?

It is about speed. Ideally interrupt service should be fast so that it doesn't hold up the rest of the code. As for calling other functions, the only problem I have noticed is the C compiler getting stupid about saving registers and wasting time and stack space. Not a problem unless you are really pushing the limits.

You do not want to call the SD code from within an interrupt. There is a very good chance that the call will not complete before the next interrupt. Stash the data in a buffer and write full buffers. Here is how I did that for an old SparkFun product.
 
Not knowing what insertPadFloat does with the MAX_LINE_SIZE, but assuming it checks for overflow, maybe your calls to

Code:
Utilities::insertPadFloat(string_buff, MAX_LINE_SIZE, iter, v.j, 3, 3, ' ');

should be 

Utilities::insertPadFloat(string_buff, MAX_LINE_SIZE - 1, iter, v.j, 3, 3, ' ');

instead as you add a tab on the end, remove it and add a zero without checking for buffer overflow.

Yes, it's always sensible to question the simple things first before digging deep, but here my utility function does the appropriate checks so as not to overrun the buffer (I checked when writing it). Even if it didn't, I make the buffer larger than needed anyway, as it costs nothing to do so. Besides I never had crashing until I introduced the interrupts into the system.
 
Have you carefully analyzed and vetted every part of Accelerometer and Utilities for interrupt safety?
Yes, I believe I have. Utilities is just for conversion functions etc.

I also see usage of Arduino's String class in one of those code fragments. String is *definitely* not interrupt safe!
Yes, the use of String here is just for debug. When commented out, it crashes just the same.

Seems pretty clear you've already invested a lot of time & effort with interrupts, so unlikely you'll want to abandon that path.
Yes, that's why it's time to either confirm an approach as ok, as potentially fixable (given greater knowledge), or abandon the idea in favour of an alternative approach. Ultimately, I wan't to build a system that works well and always works. If my current approach cannot work, then it must be changed.

That said, I can't help but feel that most of you would choose to use interrupt for the following brief:

1) Calculate the time difference in milliseconds between the latest GPS UTC timestamp and system clock milliseconds when the synchronisation pulse arrives from the GPS chip.

To me the looks like the classic case for when you'd really want to use interrupts! The way I implemented it is shown at the top of the page.

The ISR is triggered once per second by the time synch pulse from the GPS chip. The ISR calculates the current UTC time to the nearest second and calculates the offset to millis(). The ISR then compares the offset to the previous figure to see whether it has changed (drifted by 1 millisecond). If it has, it calls a function in the main .ino script that sets a variable so that the accelerometer periodicity can be adjusted at some point when convenient.

You really should avoiding using interrupts so much to trigger main program logic.
Does the description above qualify? The main program logic is called about once every few minutes and simply sets the value of one variable. Alternatively, I could have main program logic poll the GPS instance for when drift has occurred, but that would result in thousands of calls before a positive case were encountered. Would that be better?
 
(2) The accelerometer interrupt:

It is about speed. Ideally interrupt service should be fast so that it doesn't hold up the rest of the code. As for calling other functions, the only problem I have noticed is the C compiler getting stupid about saving registers and wasting time and stack space. Not a problem unless you are really pushing the limits.
I'm not pushing the limits. I designed the system to have idle time, as sometimes it needs it.

You do not want to call the SD code from within an interrupt. There is a very good chance that the call will not complete before the next interrupt.
I call the SD code from the main loop, not in response to an interrupt.

Stash the data in a buffer and write full buffers. Here is how I did that for an old SparkFun product.
I believe I'm doing it the same way. The interrupt reads the sensor via SPI1 and stores the data at the next free place in an array. If the array is full I don't read the data. The main program loop calls a function in the Accelerometer instance to safely copy the array and index data, before resetting the index so that new data can be stored. I manually measured the ISR call duration. It lasts no longer than 1us.

It sounds like you (UhClem) have successfully implemented the approach I'm taking. Does this ever conflict with the SD card? If not, then I wonder whether my problem lies in the choice of which SPI channel to use. I understand that the SD card uses SPI? If so, which channel is it using (SPI, SPI1, SPI2 or an internal one)? Am I clashing by trying to share the same channel (data lines)? This was my original fear that prompted me to post...
 
It sounds like you (UhClem) have successfully implemented the approach I'm taking. Does this ever conflict with the SD card? If not, then I wonder whether my problem lies in the choice of which SPI channel to use. I understand that the SD card uses SPI? If so, which channel is it using (SPI, SPI1, SPI2 or an internal one)? Am I clashing by trying to share the same channel (data lines)? This was my original fear that prompted me to post...

You are doing it differently. I have a group of buffers. When one gets full, data goes into the next one. The main loop watches for a buffer to be full. When that happens, it writes the data to the SD card and marks the buffer as empty. The data gathering process never pauses because there are always plenty of empty buffers for data. Even when the actual SD write is slow.

Depending on the SD library and hardware you are using, the SD card can use a dedicated high speed 4 bit wide interface completely independent of SPI. (3.5 and 3.6)
 
You are doing it differently. I have a group of buffers. When one gets full, data goes into the next one.
Ok, I use one buffer. It never fills up as it is emptied long before that eventuality.

I tested my theory of SPI conflict when reading the sensor data. I commented out the sensor read and filled the array with fake data. The system still crashed. I was surprised, you probably weren't!

Next, I'll try removing code from the ISRs so that they do nothing (once I've figured out how to fake the data flow),...
 
Status
Not open for further replies.
Back
Top