Fast Data Logger

But first I have to remember where I put the calculator app on this computer!
I know the problem!
If it helps anyone during these winter months (and in my case year-round), I have been suffering from "Brain Fog". It was terrible. I could not work on anything. The problem was very low Vitamin D levels. I now take Vitamin D pills daily and the problem has gone away. The difference is like night and day.
 
I bought a high-end SD card - a Delkin V90, rated at 300MB/sec. I assume that rating is for its UHS-II interface, but I figured it should also have lower latency. I tested it with joepasquariello's latency test firmware for 1/2 hour at 18MB/sec and sure enough:

Delkin UHS-II 256GB V90
18MB/sec 30 minutes
66355199 writes in 1791.591 s (309.518 s writing to file = 17.276 %)
File is full
fileSize = 33973861888 before sync()
rb.bytesUsed = 512 before sync()
fileSize = 33973862400 after sync()
rb.bytesUsed = 0 after sync()
rbMaxUsed = 193536
avg write us = 4.66
min write us = 4.58
max write us = 11.49
avg busy us = 18.18
min busy us = 13.99
max busy us = 7472.60
file.close()

Very low latency.

And with my logger firmware, maximum buffer usage is 150528 bytes, which is much better than with the other cards. At 18MB/sec that works out to about 8400 usecs of buffered data, which sounds right.
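As a quick check of that figure, here is a minimal host-side sketch (not part of the logger firmware) that converts a buffer high-water mark into the time it represents at a steady input rate. The function name `bufferedMicros` is made up for illustration; the 150528-byte and 18MB/sec values are the ones quoted above, with MB taken as 10^6 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Time, in microseconds, that `bytes` of buffered data represents when
// data arrives at a steady `bytesPerSec`.
double bufferedMicros(uint64_t bytes, double bytesPerSec) {
    assert(bytesPerSec > 0.0);
    return static_cast<double>(bytes) / bytesPerSec * 1e6;
}
```

With 150528 bytes at 18e6 bytes/sec this gives roughly 8363 usecs, in line with the "about 8400" estimate and a little above the 7472.60 us max busy time in the latency test.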

I have also taken data and I consistently get about 6% errors (even with the new card).

Usually the error is a 32-bit value that is set to the 32-bit value that comes next. This is very strange behavior, but I think it rules out errors in the probe (as it sends data in a 3-bit wide stream) or the SD Card, which is byte oriented. So it must be in my FPGA, where I convert the 3-bit probe stream to 32-bit wide values for the Teensy input port, or in the way I handle the buffer. These errors can occur well within the first buffer, before it rolls over.


 
IIRC, many of the larger SD cards have erase block sizes of 4-16MB. Depending on the prior write history, it may be hard to predict when a block erasure may be scheduled by the card's internal processor. When it does happen, you will see a spike in the busy time. The busy time was not an issue in most of our loggers as data rates were slow and we had sufficient buffer for up to 100mSec of busy time. What did bother us was the spike in card current and generated RFI when the card turned on the internal voltage multiplier to generate the 15-20V needed to erase the block of NAND flash. Many of our oceanographic loggers had piezo-electric shear sensors with output impedances of 500 to 800 megOhms. Not usually a problem as the transimpedance amplifier was close to the sensor and a few meters of salt water damps out environmental RFI. However we did require lots of ferrite filters on all the shielded power supply lines---even with separate battery packs for analog and digital circuitry.

We worked to minimize the issues by always deploying the loggers with new Sandisk Ultra 32GB cards. Thirty-two GB is the minimum for six months at about 2KB/second. 3 months was the normal deployment time, but ship schedules were variable.

About 5 years ago we ran into an issue with counterfeit Sandisk Cards. We learned to spot the packaging anomalies and always checked the metadata when formatting the cards.
I graphed a discharge curve while logging to a V90 card. I used two 70mA-Hr lithium batteries paralleled through ideal diodes. Even though I only took data every ten seconds, you can see voltage dips that look pretty regular and I imagine those are from those block erases. I think I'll put in a separate regulator for the SD Card.

[Image 1771102775235.png: battery discharge curve while logging to the V90 card]
 
The max busy time on this latest card certainly takes care of any issue with enough buffer space. Regarding the errors, you haven't posted any of your code or told us much about your software.
  • Are you reading data and writing to the buffer from an ISR?
  • If so, are you sure your buffer is interrupt-safe?
  • If not writing from an ISR, how does data get written to the buffer?
 
Yes, it is probably time to ask for help on that. I wanted to be sure it wasn't a card or probe problem first, and I think they are no longer suspects.

- Writing from an ISR, reading from the main loop.
- I believe it is, but I've been known to be wrong.
- ISR.

Here is the crux of the code. I do not look to see if the input pointer overruns the output pointer, because there is nothing to be done if it happens. I just have to be sure it's fast enough so it does not happen.

C:
uint32_t buf32[(BUF_SIZE + 3) / 4];  // Ensure 4-byte alignment.
volatile uint32_t* inBufPtr = (uint32_t*)buf32;
volatile uint32_t* inEndPtr = (uint32_t*) (inBufPtr + BUF_SIZE/4);
volatile uint32_t* inPtr = inBufPtr;

void ClkInterrupt()
{
   portStatusReg = mask;  // we worked around the Teensyduino handler, so we need to reset the status flag ourself
   *inPtr =  IMXRT_GPIO9_DIRECT; 
   if( ++inPtr >= inEndPtr ) inPtr = inBufPtr;
   inCnt++; // count up words of data
   asm volatile("dsb"); // avoid double calls due to possible bus sync issues

}

in loop:

  while( blocksWritten < blocksWanted )
  {
    while( inCnt < WR_WORDS );  // wait til we have 512 bytes or more
    noInterrupts();
    inCnt -= WR_WORDS;
    interrupts();
 
    if ( (bytes_out = file.write(outPtr, WR_SIZE)) != WR_SIZE )
       Serial.print('#'); // error("write failed");
     
    outPtr += WR_SIZE;  // check for rollover
    if( outPtr == bufEndPtr  ) outPtr = buf;
    blocksWritten++;
  }
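For anyone following the buffer logic, here is a minimal host-side model of the same scheme: one side stores words, the other removes them in 512-byte blocks, and a shared count tracks what is waiting. It is only a sketch under assumptions: the struct name `WordRing`, its members, and the small 1024-word capacity are hypothetical, and on the Teensy the count update in the main loop would still need the noInterrupts()/interrupts() guard shown above. Unlike the posted code, it also flags an overrun, which costs one compare and can at least be reported at the end of a run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-side model of the word ring buffer. push() plays the role of the ISR,
// popBlock() plays the role of the main loop before file.write().
struct WordRing {
    static constexpr size_t kWords = 1024;  // hypothetical capacity (multiple of kBlock)
    static constexpr size_t kBlock = 128;   // 128 words = 512 bytes, like WR_WORDS
    static_assert(kWords % kBlock == 0, "capacity must be a whole number of blocks");

    uint32_t data[kWords] = {};
    size_t in = 0;         // next word to store (inPtr in the sketch)
    size_t out = 0;        // next block to write (outPtr in the sketch)
    size_t count = 0;      // words stored but not yet written (inCnt)
    bool overrun = false;  // set if the producer laps the consumer

    void push(uint32_t w) {
        assert(in < kWords);
        data[in] = w;
        if (++in == kWords) in = 0;
        if (++count > kWords) overrun = true;
    }

    // Returns the next full block, or nullptr if fewer than kBlock words wait.
    const uint32_t* popBlock() {
        if (count < kBlock) return nullptr;
        const uint32_t* p = &data[out];
        out += kBlock;
        if (out == kWords) out = 0;
        count -= kBlock;
        return p;
    }
};
```

Because the capacity is a whole number of blocks, a block never straddles the wrap point, which is the same reason the buffer sizes in the real sketch need to be multiples of 512.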

I'm running some tests now to try to see if it's on the T4.1 end. It may well be on the FPGA side, but I imagine many of you know Verilog better than I do, so it may be worth showing that code.

Update: I changed the code so that the interrupt stores a count of how many words have been stored (modulo 2^32), so the file should simply increment from 0, and it appears to do that. That seems to implicate the FPGA code.
 

Is it possible to share a complete sketch? I don't see anything obviously wrong, but we can't see the values of any of your macros, etc. I'm curious to see your setup() and how your interrupt is configured. In a very simple test that I did with an interrupt on a digital input and sending a 4 MHz square wave into that pin, an ISR that does almost nothing seemed to take over 50% of the CPU. Your code alludes to working around the TeensyDuino interrupt map, so that must help, but are you sure you're not just running out of CPU? Are you over-clocking?
 
Certainly! Here is the code - I removed a bunch of debug stuff at the end where I do various data checks, as well as comms with the probe, as I don't know if that interface is fully released to the public yet. But you can run the code on T4.1 if you turn off USE_PROBE and jumper pin 0 to pin 1 (data clock out to data clock in). Sorry for the mix of Serial.print and cout - the couts are leftovers from bench.ino.

I measured my interrupt time and it's about 40nsecs for the interrupt vs 222nsec sample time, so yeah, it's significant but do-able. The interrupt needs to be as lean as possible. There is no time to do anything with the data other than store it. I do not overclock.

This was most helpful on making a lean interrupt:

https://forum.pjrc.com/index.php?threads/teensy-4-1-interrupt-problem.70821/


C-like:
/*
 * based off bench.ino
 * Read data into SD Card from NP2 through FPGA
 * Requires I2C interface to probe as well as 32 bit data and clock from FPGA
 * and FPGA reset from Teensy
 * This program tests high speed port reads and storage to SD Card
 *
 * NOTE - to use the 32 bit input port:
 *  - The DEBUG_EN (EMC_01) must not be routed to the MKL02Z32 chip
 *  - Undo (or comment out) the external memory setups in startup.c (starting at line 180 and line 344)
 */


#define VERSION "20260130"

#include "SdFat.h"
#include "sdios.h"
#include <Wire.h>

#define DUALBUFFER  // use both RAM1 and RAM2, else just RAM1
// #define USE_PROBE  // allow this to set up probe and read from IO port

// SD_FAT_TYPE = 0 for SdFat/File as defined in SdFatConfig.h,
// 1 for FAT16/FAT32, 2 for exFAT, 3 for FAT16/FAT32 and exFAT.
#define SD_FAT_TYPE 2

// Size of read/write.
const size_t BUF_SIZE = 438272; // max from RAM1
const size_t DBUF_SIZE = 505856; // and RAM2
const size_t WR_SIZE = 512;
const size_t WR_WORDS = WR_SIZE/4;

// ==== File size in MB where MB = 1,048,576 bytes.
// == 61800 is about an hour
const uint32_t FILE_SIZE_MB = 30900;

const uint64_t FILE_SIZE = 1048576UL * (uint64_t)FILE_SIZE_MB;
const uint32_t blocksWanted = (uint32_t) (FILE_SIZE/WR_SIZE);  // number of SD writes

uint32_t readVal;
uint32_t frameCnt = 0;
int8_t bitData[3][1600];
int row = 0;
int col = 0;
int bitCnt = 0;

const int DCLKpin = 1; // data clock in from FPGA

#define IMXRT_GPIO6_DIRECT  (*(volatile uint32_t *)0x42000000) // port access to GPIO6 (ADB0,ADB1)
#define IMXRT_GPIO7_DIRECT  (*(volatile uint32_t *)0x42004000) // port access to GPIO7 (B0,B1)
#define IMXRT_GPIO8_DIRECT  (*(volatile uint32_t *)0x42008000) // port access to GPIO8 (EMC32+)
#define IMXRT_GPIO9_DIRECT  (*(volatile uint32_t *)0x4200C000) // port access to GPIO9 (EMC0-31)

volatile uint32_t& portStatusReg = (digitalPinToPortReg(DCLKpin))[6]; // precalc status reg and mask
uint32_t mask = digitalPinToBitMask(DCLKpin);

// SDCARD_SS_PIN is defined for the built-in SD on some boards.
#ifndef SDCARD_SS_PIN
const uint8_t SD_CS_PIN = SS;
#else   // SDCARD_SS_PIN
// Assume built-in SD is used.
const uint8_t SD_CS_PIN = SDCARD_SS_PIN;
#endif  // SDCARD_SS_PIN

// Try max SPI clock for an SD. Reduce SPI_CLOCK if errors occur.
#define SPI_CLOCK SD_SCK_MHZ(50)

// Try to select the best SD card configuration.
#if HAS_SDIO_CLASS
#define SD_CONFIG SdioConfig(FIFO_SDIO)
#elif ENABLE_DEDICATED_SPI
#define SD_CONFIG SdSpiConfig(SD_CS_PIN, DEDICATED_SPI, SPI_CLOCK)
#else  // HAS_SDIO_CLASS
#define SD_CONFIG SdSpiConfig(SD_CS_PIN, SHARED_SPI, SPI_CLOCK)
#endif  // HAS_SDIO_CLASS

// Set PRE_ALLOCATE true to pre-allocate file clusters.
const bool PRE_ALLOCATE = true;

// Set SKIP_FIRST_LATENCY true if the first read/write to the SD can
// be avoided by writing a file header or reading the first record.
const bool SKIP_FIRST_LATENCY = true;

//==============================================================================
// End of configuration constants.
//------------------------------------------------------------------------------


// RAM1 buffer
uint32_t buf32[(BUF_SIZE + 3) / 4];  // Insure 4-byte alignment.
uint8_t* buf = (uint8_t*)buf32;
volatile uint32_t* inBufPtr = (uint32_t*)buf32;
uint8_t* bufEndPtr = (uint8_t*) (buf + BUF_SIZE);
volatile uint32_t* inEndPtr = (uint32_t*) (inBufPtr + BUF_SIZE/4);
volatile uint32_t* inPtr = inBufPtr; //+ 4;  // !!!!!
uint8_t* outPtr = buf;

// RAM2 buffer
volatile uint32_t DMAMEM Dbuf32[DBUF_SIZE/4] __attribute__((aligned(32)));
uint8_t* Dbuf = (uint8_t*)Dbuf32;
volatile uint32_t* DinBufPtr = (uint32_t*)Dbuf32;
uint8_t* DbufEndPtr = (uint8_t*) (Dbuf + DBUF_SIZE);
volatile uint32_t* DinEndPtr = (uint32_t*) (DinBufPtr + DBUF_SIZE/4);

volatile uint32_t inCnt = 0;
uint64_t wrCnt = 0;
uint32_t blocksWritten = 0;
volatile uint32_t outCnt = 0;


#if SD_FAT_TYPE == 0
SdFat sd;
File file;
#elif SD_FAT_TYPE == 1
SdFat32 sd;
File32 file;
#elif SD_FAT_TYPE == 2
SdExFat sd;
ExFile file;
#elif SD_FAT_TYPE == 3
SdFs sd;
FsFile file;
#else  // SD_FAT_TYPE
#error Invalid SD_FAT_TYPE
#endif  // SD_FAT_TYPE


// Serial output stream
ArduinoOutStream cout(Serial);
//------------------------------------------------------------------------------
// Store error strings in flash to save RAM.
#define error(s) sd.errorHalt(&Serial, F(s))
//------------------------------------------------------------------------------
void cidDmp() {
  cid_t cid;
  if (!sd.card()->readCID(&cid)) {
    error("readCID failed");
  }
  cout << F("\nManufacturer ID: ");
  cout << uppercase << showbase << hex << int(cid.mid) << dec << endl;
  cout << F("OEM ID: ") << cid.oid[0] << cid.oid[1] << endl;
  cout << F("Product: ");
  for (uint8_t i = 0; i < 5; i++) {
    cout << cid.pnm[i];
  }
  cout << F("\nVersion: ");
  cout << int(cid.prv_n) << '.' << int(cid.prv_m) << endl;
  cout << F("Serial number: ") << hex << cid.psn << dec << endl;
  cout << F("Manufacturing date: ");
  cout << int(cid.mdt_month) << '/';
  cout << (2000 + cid.mdt_year_low + 10 * cid.mdt_year_high) << endl;
  cout << endl;
}
//------------------------------------------------------------------------------
void clearSerialInput() {
  uint32_t m = micros();
  do {
    if (Serial.read() >= 0) {
      m = micros();
    }
  } while (micros() - m < 10000);
}
//------------------------------------------------------------------------------
volatile boolean first = true;
volatile boolean firstVal = 0;

// =====================================
// ====== C L K   I N T E R R U P T ====
// =====================================

volatile uint32_t pos = 0;
volatile  uint32_t maxBuffUsage = 0;
volatile uint32_t wordCnt = 0;
 
void ClkInterrupt()
{
   portStatusReg = mask;  // we worked around the Teensyduino handler, so we need to reset the status flag ourself

#ifdef USE_PROBE
   *inPtr =  IMXRT_GPIO9_DIRECT;  // 32 bits from FPGA
#else
   *inPtr = wordCnt++; // inCnt; // = or store something useful for debug
#endif

#ifdef DUALBUFFER           
   if( ++inPtr == inEndPtr ) inPtr = DinBufPtr; // reset buffer pointer if needed
   if( inPtr == DinEndPtr ) inPtr = inBufPtr;
#else
   if( ++inPtr >= inEndPtr ) inPtr = inBufPtr;
#endif
   inCnt++; // count up words of data
   asm volatile("dsb"); // avoid double calls due to possible bus sync issues
} 


//------------------------------------------------------------------------------
void setup()
{
  pinMode( DCLKpin, INPUT);
  pinMode( 13, OUTPUT);
  digitalWriteFast(13, LOW);

// set up NRST3v3 pin as output
  GPIO8_GDIR |= (1<<4);  // bit 4, GPIO8
  IOMUXC_SW_PAD_CTL_PAD_GPIO_SD_B1_04 = IOMUXC_PAD_DSE(7);
  IOMUXC_SW_MUX_CTL_PAD_GPIO_SD_B1_04 = 5 | 0x10;

  GPIO8_DR_CLEAR = (1<<4);  // set NRST3V3 low - hold off run

  GPIO9_GDIR = 0;  // digital-in bus is all inputs 

  Wire.begin(); //(I2C_MASTER, 0x00, I2C_PINS_18_19, I2C_PULLUP_EXT, 400000);
  
  Serial.begin(9600);
  while (!Serial); 

//Serial.println( (unsigned int)inEndPtr, HEX);
//while(1);
 
  if (CrashReport) Serial.print(CrashReport);

  Serial.print("ver: ");
  Serial.println(VERSION);
  Serial.println(__FILE__);
  Serial.printf( "%s  %s\n", __DATE__, __TIME__ );
  Serial.printf( "Teensyduino version %1lu\n", TEENSYDUINO );
  Serial.printf( "SdFat version %s\n", SD_FAT_VERSION_STR );

 
  delay(1000);
  cout << F("\nUse a freshly formatted SD for best performance.\n");
  if (!ENABLE_DEDICATED_SPI)
  {
    cout << F(
        "\nSet ENABLE_DEDICATED_SPI nonzero in\n"
        "SdFatConfig.h for best SPI performance.\n");
  }
  cout << uppercase << showbase << endl; // use uppercase in hex and use 0X base prefix
 
#ifdef USE_PROBE 
    // read probe info (this may be proprietary - so omitted for the forum)
#else  // use self generated clock from pin 0
  #define PWMRES 4        // PWM resolution: 4 bits = 16 steps
  #define PWMSTEPS 16   // to match PWMRES: there are 16 steps
  const int amp = PWMSTEPS/2 - 1;
 
  analogWriteRes(PWMRES); // write PWM resolution
  analogWriteFrequency(0,4500000); //4500000);
  analogWrite(0, amp);
#endif

  Serial.print(" file size = ");
  Serial.println( FILE_SIZE );
  Serial.print(" file time = ");
  Serial.println( FILE_SIZE/18000000);
}
//------------------------------------------------------------------------------
void loop()
{
  uint32_t t;

  // Discard any input.
  clearSerialInput();

  // F() stores strings in flash to save RAM
  cout << F("Type any character to start\n");
  while (!Serial.available()) {
    yield();
  }
#if HAS_UNUSED_STACK
  cout << F("FreeStack: ") << FreeStack() << endl;
#endif  // HAS_UNUSED_STACK

  if (!sd.begin(SD_CONFIG)) {
    sd.initErrorHalt(&Serial);
  }
  if (sd.fatType() == FAT_TYPE_EXFAT) {
    cout << F("Type is exFAT") << endl;
  } else {
    cout << F("Type is FAT") << int(sd.fatType()) << endl;
  }

  cout << F("Card size: ") << sd.card()->sectorCount() * 512E-9;
  cout << F(" GB (GB = 1E9 bytes)") << endl;

  cidDmp();

  // open or create file - truncate existing file.

  if (!file.open("bench.dat", O_RDWR | O_CREAT | O_TRUNC)) error("open failed");

  cout << F("FILE_SIZE = ") << FILE_SIZE << endl;
  cout << F("WR_SIZE = ") << WR_SIZE << endl;
  cout << F("WR_WORDS = ") << WR_WORDS << endl;
  cout << F("WORDS TO WRITE = ") << FILE_SIZE/4 << endl; 
#ifdef DUALBUFFER 
  cout << F("BUF_SIZE = ") << BUF_SIZE+DBUF_SIZE << F(" bytes\n");
  Serial.print("run time(secs): ");
  Serial.println( FILE_SIZE/18000000 );
  Serial.println(BUF_SIZE);
  Serial.println(DBUF_SIZE);
  Serial.println((unsigned int)bufEndPtr);
  Serial.println((unsigned int)DbufEndPtr);
#else
  cout << F("BUF_SIZE = ") << BUF_SIZE << F(" bytes\n");
#endif

#ifdef DUALBUFFER
  Serial.println("Dual Buffer");
#else
  Serial.println("Single Buffer"); 
#endif

  // do write test
 // uint32_t n = FILE_SIZE / BUF_SIZE;
#ifdef USE_PROBE
   // set up probe
#endif

  if (PRE_ALLOCATE)
  {
    if (!file.preAllocate(FILE_SIZE))  error("preAllocate failed");
  }

Serial.println("allocated");

delay(3000);

  // start up data read interrupt after everything is ready to go
  attachInterrupt(DCLKpin, nullptr, RISING); //CHANGE); // let Teensyduino do the setup work
  attachInterruptVector(IRQ_GPIO6789,ClkInterrupt); // override Teensyduino handler and invoke the callback directly
  NVIC_ENABLE_IRQ(IRQ_GPIO6789);
  NVIC_SET_PRIORITY(IRQ_GPIO6789, 64);  // 0;// highest priority, might be good to reduce a bit
    
  Serial.print(inCnt);
  Serial.println("go");
    
  GPIO8_DR_SET = (1<<4);   // let the FPGA run

#ifdef USE_PROBE   
    // tell probe start sending data
#endif
  
  t = millis(); 
  uint32_t bytes_out;
 
  //  ===== Here is where it all happens ========
  //
  while( blocksWritten < blocksWanted )
  {
    while( inCnt < WR_WORDS );  // wait til we have 512 bytes or more  //(wrCnt + WR_WORDS) ); // wait until we have > WR_SIZE bytes( WR_SIZE/4 words) of new data 
    noInterrupts(); 
    inCnt -= WR_WORDS; 
    if( inCnt > maxBuffUsage ) maxBuffUsage = inCnt;
    interrupts();
      
    outCnt++;
    if( outCnt > (FILE_SIZE/512) )   GPIO8_DR_CLEAR = (1<<4);  // hold FPGA in reset
 
    if ( (bytes_out = file.write(outPtr, WR_SIZE)) != WR_SIZE ) //BUF_SIZE) != BUF_SIZE) {
       Serial.print('#'); // error("write failed");
        
    outPtr += WR_SIZE;
#ifdef DUALBUFFER
    if( outPtr == bufEndPtr  ) outPtr = Dbuf;
    if( outPtr == DbufEndPtr ) outPtr = buf;
#else
    if( outPtr  == bufEndPtr  ) outPtr = buf;
#endif
    blocksWritten++; 
  }
 
  // don't need any more data
  GPIO8_DR_CLEAR = (1<<4);  // hold FPGA in reset
  detachInterrupt(DCLKpin);
  analogWrite(0, 0);
 
  file.flush(); // sync();
  t = millis() - t;
 // s = file.fileSize();
  Serial.println( outCnt);
  Serial.println( (unsigned int)outPtr);
  Serial.println( (unsigned int)inPtr);
  Serial.println( (unsigned int)buf);
  Serial.println( inCnt);

  Serial.print("max Buf Usage (bytes): ");
  Serial.println(maxBuffUsage*4);

  Serial.print("end : ");
  Serial.println(blocksWritten);
  Serial.print("bytes:");
  Serial.println(file.fileSize());
  Serial.print("millis:");
  Serial.println(t);
  Serial.print("bytes/sec:");
  Serial.println((file.fileSize()/t ) );


  Serial.print("Total frames: ");
  frameCnt = file.fileSize()/600;
  Serial.println(frameCnt);

  file.close();
  sd.end();
  GPIO8_DR_CLEAR = (1<<4);  // hold FPGA in reset
  while(1);
}
 
Thanks for sharing the code and also the link to the thread on reducing interrupt overhead. The sketch embedded in that thread ran with no changes and I got the same values. If I understand the numbers correctly, the time from the triggering edge to the recording of the timestamp in the ISR is ~22 cycles of the 600-MHz clock (37 ns). When you say your ISR is 40 ns, I assume you mean in addition to the 37 ns interrupt latency, for a total of ~77 ns? That puts you at 77/222 = 35% of CPU. We know that writing to SD at 18 MB/s takes about 18%, for a total of 53%, so maybe it's okay.
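The budget above is simple enough to write down as code. This is only a back-of-the-envelope sketch: the helper names `cyclesToNs` and `isrLoad` are invented for illustration, and the inputs (22 cycles, 600 MHz, 77 ns, 222 ns, 18%) are the estimates quoted in this post, not new measurements.

```cpp
#include <cassert>

// Convert CPU cycles to nanoseconds for a core clock given in MHz.
double cyclesToNs(double cycles, double cpuMHz) {
    assert(cpuMHz > 0.0);
    return cycles * 1000.0 / cpuMHz;
}

// Fraction of CPU time used by an ISR that costs `isrNs` per call and
// fires once every `periodNs`.
double isrLoad(double isrNs, double periodNs) {
    assert(periodNs > 0.0);
    return isrNs / periodNs;
}
```

`cyclesToNs(22, 600)` is about 36.7 ns, and `isrLoad(77, 222)` is about 0.35. Adding the ~0.18 spent writing to the SD card gives roughly 0.53 of the CPU, as estimated above.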

I did a quick test where the ISR just increments a counter, and loop() prints the number of executions in each second. That seemed to work okay up to ~11.5 MHz, which fits pretty well with 4.5 MHz consuming ~35% CPU.

I'm curious to see what happens if I try modifying my SD test sketch to use the "lean" interrupt and write a smaller number of bytes to the RingBuf at a much higher frequency.
 
Yes - that is the time within the ISR and does not include the latency to call the ISR (and to return). I had set a pin high at the beginning of the ISR and lowered it at the end, and then subtracted the time for digitalWriteFast (about 4 nsec) to get the ~40nsecs value.

I should test again and measure from the rising edge of the data clock to the return to get a better number for the ISR time.
 
I haven't run your program yet, but I don't see anything wrong in the buffer logic. I don't understand what constitutes an "error" in your data, but one suggestion would be to define a uint32_t isr_count, increment on each interrupt and write that value to the buffer. When the logging is complete, you should have a file with 0, 1, 2, etc. If that looks good, the problem would have to be in your FPGA?
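The check on the resulting file could look something like the following host-side sketch. The function name `firstSequenceBreak` is hypothetical, and it assumes the file has already been read into memory as an array of 32-bit words in the same byte order the Teensy wrote them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the index of the first word that is not (previous word + 1),
// or n if the whole array counts up by one. The comparison wraps modulo
// 2^32, matching a uint32_t counter incremented in the ISR.
size_t firstSequenceBreak(const uint32_t* words, size_t n) {
    assert(words != nullptr || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (words[i] != static_cast<uint32_t>(words[i - 1] + 1u)) return i;
    }
    return n;
}
```

A return value equal to n means no gaps or repeats were found. Anything smaller is the position of the first bad word, which can then be compared against the 512-byte block and buffer boundaries.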

FYI, I got a DELKIN 128 GB UHS-II V90 card, and max busy times are 7-8 ms, compared to ~40 with the Sandisk cards. I ran my sketch for 300 sec at 20 MB/s with no problems. I assume you are doing your testing now without the dual buffer, since you don't need it?
 
Yes, I did exactly that, and at the end I programmed the Teensy to read the file and compare stored values against a counter in the loop. This was on a stock T4.1. It reported no errors. My next step is to run it on my board. Then, if that is good, which it ought to be, I will have the FPGA do the same: send out incrementing data rather than the probe data.

In most of the errors I’ve seen, periodically (~6%) one of the 32-bit values is wrong (not just a single byte). Strangely, it usually has the value of the next 32-bit value. So the error is prescient. Edit: I’m sitting here trying to figure out how that could happen. One mechanism would be if the clock for a data value was delayed until the next value was set up and then that value was double clocked. Looking at it, I don’t see how that could happen in the FPGA code, but maybe it’s noise or track length or other circuit related problem. I can check for that with the scope when I get into work.
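One way to put a number on this kind of error, sketched here under assumptions, is to count how often a word equals the word that follows it. The name `countRepeatedNext` is made up, and the check is only meaningful for data where genuine back-to-back repeats are not expected (a test pattern, for example). With real probe data a legitimate repeat would be counted too.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts positions i where words[i] == words[i + 1], i.e. places where a
// word appears to have been replaced by the value that comes next.
size_t countRepeatedNext(const uint32_t* words, size_t n) {
    assert(words != nullptr || n == 0);
    size_t hits = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (words[i] == words[i + 1]) hits++;
    }
    return hits;
}
```

Dividing the result by the number of words gives an error rate that can be compared with the ~6% figure from run to run.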

I used the dual buffer. The extra overhead is minimal, and if it works, why not? You never know when the card might decide to take a longer break.

BTW, one downside of a faster SD card is higher current and, consequently, shorter battery life.
 

On the dual buffer, I would tend to go the other way and keep everything in TCM if possible, since it seems to be more than enough. One little thing I noticed is that you are using priority 64 for the clock interrupt. Teensy's 1-kHz systick is 32, and while systick_isr() is very short, your clock interrupt runs 4500 times more often, so its priority should be higher (lower number). If you haven't tried it lately, do a test with priority = 0 and see if it matters. One style point I would make is that it would be helpful if your initialization of the buffer sizes made clear that they are multiples of 512, and must be for correct operation.
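As a sketch of that style point, the sizes can be written as a block count times the write size, with a compile-time check that fails the build if someone later changes a size to a value that is not a multiple of 512. The names `BUF_BLOCKS` and `DBUF_BLOCKS` are hypothetical; the resulting byte counts are the same 438272 and 505856 used in the posted sketch.

```cpp
#include <cstddef>

const size_t WR_SIZE = 512;                       // bytes per SD write
const size_t BUF_BLOCKS = 856;                    // hypothetical name: blocks in RAM1
const size_t DBUF_BLOCKS = 988;                   // hypothetical name: blocks in RAM2
const size_t BUF_SIZE = BUF_BLOCKS * WR_SIZE;     // 438272 bytes
const size_t DBUF_SIZE = DBUF_BLOCKS * WR_SIZE;   // 505856 bytes

// outPtr is only compared for equality with the buffer end pointers, so each
// buffer must hold a whole number of 512-byte blocks.
static_assert(BUF_SIZE % WR_SIZE == 0, "BUF_SIZE must be a multiple of WR_SIZE");
static_assert(DBUF_SIZE % WR_SIZE == 0, "DBUF_SIZE must be a multiple of WR_SIZE");
```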
 
Running an interrupt at that frequency for data collection is crazy - use a DMA channel instead.

I trust that you're right, but I have to admit that I know next to nothing about DMA on Teensy. This is such a simple application, it would be really helpful to work through it as an example for DMA usage. The OP's sketch simply reads a 32-bit GPIO register on the rising edge of an external clock and writes that data to a ring buffer. Can you explain in general terms how you would approach doing this with DMA?
 
Well, dang. I thought I had read that a higher number meant higher priority. Good catch!
I wonder if I could also disable the systick_isr - I imagine it would only impact things like 'delay', which I can work around.


UPDATE: I changed the priority, but no go.
- and I added a note in the code about buffer size - thanks!
 
I looked at DMA a bit, but, in my understanding, DMA can only access the slow version of the GPIO registers. I don't think that would be fast enough.
 
HA!
I think I figured it out. My FPGA sets up the next data and sets the data clock low. Then, halfway to when it is time to set up the next data, it sets the output data clock high. The Teensy interrupts on the rising edge of the data clock, but it doesn't latch the data until it is in the interrupt routine. Evidently, the read sometimes takes place long enough after the clock edge that the FPGA has already set up the next value. That explains why I would sometimes get a prescient value. It was reading that value twice: once late and then on time. I verified with a scope by adding a digital out high just after data is read, and sure enough, that high would sometimes occur after the data from the FPGA had been updated. I'm working on fixing that, so this may be premature, but it does make sense.

It also explains why, when I stored an incrementing count, it worked, as the count was incremented after the save of the current value.
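To make the mechanism concrete, here is a small timing model, written as a host-side sketch rather than anything measured. It assumes word k is on the bus from k*period until (k+1)*period, that the clock edge for word k arrives a fixed offset after the word appears, and that the read happens some latency after that edge. The function name and the latency values used in the example are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Index of the FPGA word that is on the bus at the moment the ISR reads it.
// k            : index of the word whose clock edge triggered the interrupt
// periodNs     : time between words (222 ns at 4.5 MHz)
// edgeOffsetNs : delay from the word appearing to its clock edge
// latencyNs    : delay from the clock edge to the actual port read
uint64_t wordSeenByRead(uint64_t k, uint64_t periodNs,
                        uint64_t edgeOffsetNs, uint64_t latencyNs) {
    assert(periodNs > 0);
    uint64_t readTimeNs = k * periodNs + edgeOffsetNs + latencyNs;
    return readTimeNs / periodNs;
}
```

With the edge at mid-period (offset 111 ns), a read 100 ns after the edge for word 5 still sees word 5, but a read 120 ns after the edge sees word 6. The next interrupt then reads word 6 on time, so word 6 is stored twice and word 5 is lost. Moving the edge to the moment the data changes (offset 0) leaves nearly the whole 222 ns period as margin.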

All the talk about interrupt speed got me thinking, thanks.
 
I gather you wanted the sketch to read the data about halfway through the time it is valid, but if the FPGA sets the clock low after asserting the data, couldn't you just change your sketch to interrupt on the falling edge? There is a built-in delay of ~37 ns from the interrupt edge until the read actually occurs.
 
Of course! The interrupt delay gives me plenty of setup time. As it turns out, the way the FPGA code runs, your idea misses the first data value, but I just changed the clock sense in the FPGA so it goes high when the data is changed - so same concept.

Here is when it didn't work - 0 is my test pin out (goes high just after saving data, low at end of interrupt), 1 is the data clock, and 2 is bit 0 which I am making toggle each clock. You can see the first two test pin transitions happen when the data is 0. The previous '1' data was missed.

[Image 1771270682204.png: scope capture of the failing case]


And with your fix. The data is captured well before the halfway point. If needed I could always tweak it so I wait a bit after setting up the data before setting the clock high, but I think this is good enough.

[Image 1771270829697.png: scope capture with the fix]
 
Did your FPGA fix solve the errors? Let us know how it goes from here. I'm interested in what you see for SD max busy time in your long runs.
 
I sent the data to the person that is writing the host code to check it, but from the checks I have done it looks good.
I will be running more tests, with various cards, looking at buffer and battery usage. I will report back. I'll also sum up everything I have learned.
 
Sounds good. I don't know if this custom board will ever have to do anything else in parallel with the data logging, but if it does, keep in mind that as your program is currently structured, the call to file.write() will block during those long SD busy times. If you want to avoid that, you can use isBusy() as in the TeensySdioLogger example (and my sketch) and only call write() when it returns false.
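The shape of that pattern, sketched so it can be tried on a host: the helper below is written as a template so that any file type with isBusy() and write() can be passed in. `writeIfReady` and `MockFile` are hypothetical names used only for this illustration, and the mock stands in for the real SD file object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Writes one block only if enough data is waiting and the card is not busy.
// Returns true if a full block was written, false if the caller should go
// do other work and try again later.
template <typename FileT>
bool writeIfReady(FileT& file, const uint8_t* block, size_t blockSize,
                  size_t bytesAvailable) {
    assert(block != nullptr);
    if (bytesAvailable < blockSize) return false;  // not enough data yet
    if (file.isBusy()) return false;               // card busy, do not block
    return file.write(block, blockSize) == blockSize;
}

// Hypothetical stand-in for an SD file, used only to exercise the logic.
struct MockFile {
    bool busy = false;
    size_t written = 0;
    bool isBusy() const { return busy; }
    size_t write(const void*, size_t n) { written += n; return n; }
};
```

In the logging loop this would replace the blocking wait: the loop calls the helper each pass, advances outPtr and the counters only when it returns true, and is free to do other work when it returns false. The ring buffer still has to be large enough to ride out the longest busy period.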
 
I'm trying to have it do nothing else during data collection, but that's a good idea to keep in mind.
 