Code ported from Teensy3.0 to 3.2 - now flakey

I have made some further investigation.

I am now using a Teensy3.5 to generate a sequence of 4us "read-data" pulses and some fake index pulses. See my posting in another thread about using DMA and the PIT timers to stream an 8-bit array to a 8-bit GPIO port.

I have connected the fake read-data and index pulses to the Teensy3.0. floppy disk read sketch. I am running some debug code which simply catches FTM timer values and displays them in a condensed histogram. I have programmed the FTM to run at 12MHz. Each 4us pulse should in theory result in a captured value of 48. I capture only 700 events(pulses), limited by available RAM on the Teensy3.1.

Code:
Total number of event captured 700
00000 _____ _____ _____ _____ _____ _____ 00001 _____ <- runt pulse value
00008 _____ _____ _____ _____ _____ _____ _____ _____
00016 _____ _____ _____ _____ _____ _____ _____ _____
00024 _____ _____ _____ _____ _____ _____ _____ _____
00032 ***** _____ _____ _____ _____ _____ _____ _____
00040 _____ _____ _____ _____ _____ _____ 00698 00001
00048 _____ _____ _____ _____ _____ _____ _____ _____
00056 _____ _____ _____ _____ ***** _____ _____ _____
00064 _____ _____ _____ _____ _____ _____ _____ _____
00072 _____ _____ _____ _____ _____ _____ _____ _____
00080 _____ _____ _____ _____ ***** _____ _____ _____
00088 _____ _____ _____ _____ _____ _____ _____ _____
00096 _____ _____ _____ _____ _____ _____ _____ _____
00104 _____ _____ _____ _____ _____ _____ _____ _____
00112 _____ _____ _____ _____ _____ _____ _____ _____
00120 _____ _____ _____ _____ _____ _____ _____ _____

The value of 46 is always captured, but this is explained by the time taken to stop, reset and re-start the FTM timer, so we lose a little time.

I edited in the asterixis to show the values used to bin the values into: too short; short(4us); medium(6us); long(8us).

I am not bothered by the runt pulse whose value likely comes from the first captured value. The repeatability is amazing! When I do further captures, the result is nearly always exactly the same (including the single captured value of 47 and the runt pulse).

Code:
Total number of event captured 700
00000 _____ _____ _____ _____ _____ _____ 00001 _____
00008 _____ _____ _____ _____ _____ _____ _____ _____
00016 _____ _____ _____ _____ _____ _____ _____ _____
00024 _____ _____ _____ _____ _____ _____ _____ _____
00032 _____ _____ _____ _____ _____ _____ _____ _____
00040 _____ _____ _____ _____ _____ 00001 00697 _____
00048 00001 _____ _____ _____ _____ _____ _____ _____
00056 _____ _____ _____ _____ _____ _____ _____ _____
00064 _____ _____ _____ _____ _____ _____ _____ _____
00072 _____ _____ _____ _____ _____ _____ _____ _____
00080 _____ _____ _____ _____ _____ _____ _____ _____
00088 _____ _____ _____ _____ _____ _____ _____ _____
00096 _____ _____ _____ _____ _____ _____ _____ _____
00104 _____ _____ _____ _____ _____ _____ _____ _____
00112 _____ _____ _____ _____ _____ _____ _____ _____
00120 _____ _____ _____ _____ _____ _____ _____ _____

It took quite a few runs to capture a slight deviation.

It confirms my observation looking at the histogram captured from genuine double density floppy disk data, the timer results captured by the Teensy3.0 are very repeatable. As a consequence when the 3 pulse lengths of 4us, 6us, and 8us are "binned" the repeatability results in a clean data stream decode. NOTE: the MFM encoding/decoding scheme is very sensitive to even a single pulse mis-bin.
 
I replaced the Teensy3.0 with a Teensy3.1 and re-ran the experiment.

Code:
Total number of event captured 700
00000 _____ _____ _____ _____ _____ _____ _____ _____
00008 _____ _____ 00001 _____ _____ _____ _____ _____
00016 _____ _____ _____ _____ _____ _____ _____ _____
00024 _____ _____ _____ _____ _____ _____ _____ _____
00032 _____ _____ _____ _____ _____ _____ _____ _____
00040 _____ _____ _____ _____ _____ _____ 00698 _____
00048 _____ _____ _____ _____ _____ _____ _____ _____
00056 _____ _____ _____ _____ _____ _____ _____ _____
00064 _____ _____ _____ _____ _____ _____ _____ _____
00072 _____ _____ _____ _____ _____ _____ _____ _____
00080 _____ _____ _____ _____ _____ _____ _____ _____
00088 00001 _____ _____ _____ _____ _____ _____ _____
00096 _____ _____ _____ _____ _____ _____ _____ _____
00104 _____ _____ _____ _____ _____ _____ _____ _____
00112 _____ _____ _____ _____ _____ _____ _____ _____
00120 _____ _____ _____ _____ _____ _____ _____ _____
number of long events captured 0
reading track to test pulse delta timings...
Total number of event captured 700
00000 _____ _____ _____ _____ _____ _____ _____ _____
00008 _____ _____ _____ 00001 _____ _____ _____ _____
00016 _____ _____ _____ _____ _____ _____ _____ _____
00024 _____ _____ _____ _____ _____ _____ _____ _____
00032 _____ _____ _____ _____ 00001 _____ _____ _____
00040 _____ _____ 00001 _____ _____ _____ 00694 _____
00048 _____ _____ _____ 00001 _____ _____ _____ _____
00056 00001 _____ _____ _____ _____ _____ _____ _____
00064 _____ _____ _____ _____ _____ _____ _____ _____
00072 _____ _____ _____ _____ _____ _____ _____ _____
00080 _____ _____ _____ _____ _____ _____ _____ _____
00088 _____ _____ _____ _____ _____ _____ _____ _____
00096 _____ _____ _____ _____ _____ _____ _____ _____
00104 _____ _____ _____ _____ _____ _____ _____ _____
00112 _____ _____ _____ _____ _____ _____ _____ _____
00120 _____ _____ _____ _____ _____ _____ _____ _____
number of long events captured 1
reading track to test pulse delta timings...
Total number of event captured 700
00000 _____ _____ _____ _____ _____ _____ _____ _____
00008 _____ _____ _____ 00001 _____ _____ _____ _____
00016 _____ _____ _____ _____ 00001 _____ _____ _____
00024 _____ _____ _____ _____ _____ _____ _____ _____
00032 _____ _____ _____ 00001 _____ _____ _____ _____
00040 _____ _____ _____ 00001 00001 _____ 00692 _____
00048 00001 00001 _____ _____ _____ _____ _____ _____
00056 _____ 00001 _____ _____ _____ _____ _____ _____
00064 _____ _____ _____ _____ _____ _____ _____ _____
00072 _____ _____ _____ _____ _____ _____ _____ _____
00080 _____ _____ _____ _____ _____ _____ _____ _____
00088 _____ _____ _____ _____ _____ _____ _____ _____
00096 _____ _____ _____ _____ _____ _____ _____ _____
00104 _____ _____ _____ _____ _____ _____ _____ _____
00112 _____ _____ _____ _____ _____ _____ _____ _____
00120 _____ _____ _____ _____ _____ _____ _____ _____

Immediately you can see a substantial variation in a number of captured values. Not always since the first run was clean, apart from the outlier at 88.

It is exactly the same code, I only changed the Teensy type in the "Tools->Board" menu (and re-compiled).

Again I will mention that I am overclocking the CPU at 96MHz, so this can't be ruled out as a reason....until I tried compiling the Teensy3.1 at stock 72MHz (*) and the variation of the captured values remained the same.

(*) the data stream decode won't work at this CPU speed, but the debug code runs fine since the interrupt code to capture values is tiny and trivial (plenty of time at 72MHz).
 
I just noticed that for the Teensy3.1, for every "slightly too long" captured value, there is a matching "too short value".

Looking at the middle table.
Expected value = 46.
Pair values 51 and 43. (43+51)/2 => 47.
Pair values 56 and 36. (56+36)/2 => 46.

It seems to confirm that something is delaying the triggering of the interrupt routine, so it captures a larger than expected timer value, then on the next interrupt it captures a corresponding smaller timer value.
 
Last edited:
It's been nearly a year, but I am re-visiting this issue.

It has really bugged me as to what is going on.

I have now done some work to generate a clean data stream, from a Teensy3.5, to help investigate. The new code will allow me to generate a double density Atari floppy disk compatible format data stream, to either write a sector or even write a track. My hope was to use a Teensy3.5 to be able to read and write from a 3.5inch floppy (possibly even HD format) to give me a tool to connect to other projects, possibly via SPI or serial.

So the T3.0 code works flawlessly. Unfortunately this teensy hasn't enough RAM or CPU speed for my end goal.

The code on the T3.1 is flaky.
The code on the T3.2 is flaky.
The code on the T3.5 is also flaky (new discovery).

I realise pursuing a solution is sort of academic, as the T3.X series is no longer available and few are likely to run into the same problem. BUT is is really bugging me.

To summarise. I am decoding a stream of pulses, with deltas on the rising edge of 4us, 6us and 8us. I am using the FTM timer (clocked at 60MHz) to measure to time delta. Each pulse causes an interrupt which 1) grabs the current timer value then 2) zeroes the timer.

Using the new code on the T3.5 I can generate a very accurate stream of 1's and 0's at a rate of 1MHz by means of DMA.
I have several "debug" sequences, and a sequence which is the stream for a sector header and data body part.

The "sector" sequence gets decoded 100% of the time when code runs on the T3.0, but less than 80% on the other T3.X CPUs.

My guess is that either the interrupt start is delayed or that the interrupt is interrupted and so delayed. I use digital pins and a Rigol scope to debug. I think that both conditions can happen on the T3.1/2/5 . The question is what is causing this. Hopefully it is simply some hardware feature which I can turn off in software.

Here is the ISR code, cut down to the very minimum.

Code:
FASTRUN void FDD_stream_decode_ISR(void) {
  uint8_t  timer;
  uint32_t isfr = PORTD_ISFR;
  PORTD_ISFR = isfr;

  digitalWriteFast(debugpin,HIGH);
  //-----------------------------------------------------------------------------------------------------
  FTM0_SC  = FTM_SC_CLKS(0) | (PRESCALE);  // STOP TIMER
  timer    = (uint8_t)(FTM0_CNT);          // READ COUNT
  FTM0_CNT = 0;                            // RESET COUNT
  FTM0_SC  = FTM_SC_CLKS(1) | (PRESCALE);  // RESTART TIMER

  if (readmode==RECORD_EVENTS) {
    if (events<MAX_EVENT_COUNT) { eventdata[events++] = timer; } // RECORD //
    digitalWriteFast(debugpin,LOW);
  }
}

So I would expect to have the same execution time, since it is only recording timer values. BTW "events" can never get too big with the debug sequence (so the if is always executed).

The debug sequence is a stream of 1us pulses at a repetition rate of 6us (oops I found a bug in my code it should have been 4us).
The blue is the serial stream generated on the T3.5. The yellow is the output from a debug digital pin on the T3.1 .
It takes about 500ns to enter the interrupt and another 500ns to execute the above code.

INT event normal.png


Using the T3.1 as an example, and the sophisticated triggering of the Rigol scope, I captured the following.


INT event elongation.png


This might explain why I am getting "rouge" timer values. What on earth has caused the interrupt to be 1us longer?
My only guess is something else (an interrupt?) of higher priority is happening which takes 1us to execute.

The SYSTICK interrupt has been disabled and the PORTD interrupt has been set to the highest priority. I think details appear earlier in the thread.

I would really appreciate any help to get to the bottom of this.

The T3.0 has a different CPU to the T3.1/2/5, but despite me viewing the teensy hardware source code, I can't find what the source of the problem is.

I would like to complete the project using a T3.5, which has a faster CPU and enough RAM for generating the write data stream. It's faster CPU may allow HD disks from PC to be decoded.

Thank you.
 
Just to be sure I would add asm volatile("dmb"); after the digitalWriteFast statement that sets the pin low, to ensure that operation has completed before the end of the ISR. And also move it out of the if clause; you are unconditionally setting the pin high at the start of the ISR so it would make sense to unconditionally lower it.
 
Just to be sure I would add asm volatile("dmb"); after the digitalWriteFast statement that sets the pin low, to ensure that operation has completed before the end of the ISR. And also move it out of the if clause; you are unconditionally setting the pin high at the start of the ISR so it would make sense to unconditionally lower it.
Thank you for taking the time to reply.

Apologies. I cut and pasted code, leaving out a big section [ if (readmode==DECODE_EVENTS) {...} ] which wasn't relevant for debug, also omitting the final [ asm volatile("dsb"); ] (by mistake) which I believe is similar to to the asm you suggest.
 
Progress!

The problem appears to be fixed for the T3.1 and T3.1 boards.
Code for the T3.5 does work but needs further debug.

Before calling the code which reads and decodes the data stream (on the fly) and where timings are critical, when I add the following two lines the returned timer values become stable and decode of real sector data is 100% reliable.

C:
NVIC_DISABLE_IRQ(IRQ_USBOTG);
NVIC_DISABLE_IRQ(IRQ_USBDCD);

I re-enable these interrupts after the code segment which does the floppy data read/decode.

I now use Coolterm rather than TerTerm (it appeared to lose USB serial data - see my other posts) and disabling and re-enabling these interrupts did not disturb USB serial operation.

So I did have a suspicion that a spurious interrupt was being executed which caused the read data port interrupt to be delayed. So I did something I should have done ages ago. I found all the defines in kinetis.h and used NVIS_DISABLE_IRQ for all interrupts except the PORTA/B/C/D/E interrupts just before the critical code segment, then re-enabled them afterwards. This resulted in a clean set of timer values as shown in my debug code. I then selectively commented out lines disabling each IRQ until I found the ones which caused a problem.

AHA!!! ... in checking something before posting ... I may just now have found the REAL problem.

C:
const byte fromFDDreadPinIRQ  = 43; // IRQ_PORTD = 43,   // kinetis.h (Teensy 3.0)

 NVIC_SET_PRIORITY(fromFDDreadPinIRQ, 0);   // HIGHEST priority

void enable_FDD_stream_decode_ISR(void) {
  NVIC_ENABLE_IRQ(fromFDDreadPinIRQ);
}

I was just checking kinetis.h again. To be my surprise (horror!) I see that the IRQ number is not the same for PORTD between the different versions of the teensy 3 series!!! It is 43 for T3.0 (as my comment says), but is 90 for T3.1 and T3.2 .

I now need to go away and check this. In other words only for T3.0 was the interrupt priority set to the highest, which explains why other interrupts were able to disturb/delay the port interrupt from the floppy.
 
Last edited:
When I remove the USB IRQ disables (mentioned above) and instead use the correct IRQ for the T3.2, so setting the port interrupt as the highest priority, the code works flawlessly.

C:
const byte fromFDDreadPinIRQ  = 43; // IRQ_PORTD = 43,   // kinetis.h (Teensy 3.0)

I last worked on this code over a year ago. Note that the #define from kinetis.h "IRQ_PORTD" is commented out in the line above. Looks as if I might have used this #define at some point. I have no idea why I hard coded the number.
 
Back
Top