Code ported from Teensy3.0 to 3.2 - now flakey

badsector

This follows on from a previous thread regarding serial data corruption over USB, which was likely due to using the 96MHz overclock on a Teensy3.0.

The project uses a Teensy3.0 to decode the read-data stream from a double density disk in an Atari ST disk drive.

I replaced the 3.0 with a 3.2 board. I copied the sketch in case changes were needed, however the code compiled without needing any changes. Unfortunately reading data from floppy disk sectors has become very unreliable with about 1 read in 10 resulting in the correct data. The same code on the 3.0 was rock solid.

My first guess was that something was different with respect to the FTM timer source clock. When reading the read-data pulses from the floppy, I simply need to be able to distinguish between long, medium and short pulses. I have some test code which prints a crude histogram. Using this I was able to see where to set the limit values. On the 3.0 the histogram was surprisingly repeatable. On the 3.2 it looks different, but I think that it is more or less the same shape.
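For illustration, a crude histogram of this kind might look like the sketch below. It is not the test code referred to above: the names `build_histogram` and `print_histogram` and the 128-bin size are assumptions, and it uses `printf` so that it can run on a desktop (on a Teensy the output would go through `Serial` instead).

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define HIST_BINS 128  /* assumed: counts are expected in the range 0..127 */

/* Count how often each timer value occurs in the captured samples. */
void build_histogram(const uint8_t *samples, size_t n, uint32_t hist[HIST_BINS])
{
    memset(hist, 0, HIST_BINS * sizeof(uint32_t));
    for (size_t i = 0; i < n; i++) {
        uint8_t v = samples[i];
        if (v >= HIST_BINS) {
            v = HIST_BINS - 1;  /* clamp overlong pulses into the last bin */
        }
        hist[v]++;
    }
}

/* Print one row of '#' characters per non-empty bin. */
void print_histogram(const uint32_t hist[HIST_BINS])
{
    for (int b = 0; b < HIST_BINS; b++) {
        if (hist[b] == 0) {
            continue;
        }
        printf("%3d | ", b);
        for (uint32_t k = 0; k < hist[b]; k++) {
            putchar('#');
        }
        putchar('\n');
    }
}
```

Three separated clusters in the output would correspond to the short, medium and long pulses, and the gaps between the clusters are where the limit values would go.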

I was concerned that the clock used to clock the FTM might be different on the 3.2 as compared to the 3.0. I am no expert with the hardware of the Teensy CPU. I think that I selected the "system clock" to clock the FTM, but TBH I don't think I really knew its frequency. I prescaled it by dividing by 4, which gave reasonable numbers in the range 0 to 127 when reading back the count. Not the most professional way, in that I didn't rigorously understand how the FTM worked/was clocked.

A FTM count value of 70 is about average for a 6us edge-to-edge data bit. I guess that makes "system clock" for the 3.0 as 48MHz (NOTE I am using a 96MHz overclock), which I think makes sense (48/4 = 12 , 70/6 = 11.6).
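The arithmetic above can be checked with a minimal sketch. It assumes the figures given in the post (a 48MHz timer clock and a divide-by-4 prescaler); the function name `ticks_for_us` is hypothetical and not part of any Teensy library.

```c
#include <stdint.h>

/* Nominal timer count for a pulse lasting `us` microseconds.
   timer_hz and prescale are assumptions taken from the post
   (48 MHz timer clock, divide-by-4 prescaler). */
uint32_t ticks_for_us(uint32_t timer_hz, uint32_t prescale, uint32_t us)
{
    return (uint32_t)(((uint64_t)timer_hz * us) /
                      ((uint64_t)prescale * 1000000u));
}
```

With those assumptions, 4us, 6us and 8us pulses would nominally give counts of 48, 72 and 96, which is consistent with the observed average of about 70 for a 6us pulse. If the timer clock were different, every count would scale with it. For example, a hypothetical 36MHz clock would give 54 counts for the same 6us pulse, so fixed limit values would no longer line up.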

So on the 3.2 what is the frequency of the system clock (with 96MHz overclock)?

I can also display the raw data bytes. An interesting observation for the 3.2 is that the header is always read correctly, and the CRC is always correct. This is only a short burst of pulses, the header being 4 bytes long. When a sector reads incorrectly it is obvious from not only the ASCII hex bytes dumped to the serial terminal, but also because the CRC is wrong.

One guess is that I am getting an "unexpected" interrupt during the sector read which changes the value for one pulse or even loses a pulse. This will corrupt the data stream from that point. I wonder if this is what I am seeing? Any rogue pulse value causes the decode state machine to go wrong and it can't re-sync until the next header synchronisation "missing clock".

Note: while reading a sector I switch off the "systick" interrupt, which would otherwise cause the same problem.

On the 3.2 are there any new sources of interrupt which might be happening, and which would impact the FTM timer values?

It could possibly be a hardware problem causing a glitch. I'm just searching for a possible software reason first.

Relevant code?

From void setup()

C:
// set up timer config which doesn't change
  FTM0_MODE = FTM_MODE_FTMEN | FTM_MODE_WPDIS;
  FTM0_MOD  = 0xFFFF;
  // disable to begin with - done by setting clock to disabled!
  FTM0_SC   = FTM_SC_CLKS(0) | (PRESCALE); // disable clock + no prescale

Interrupt routine called on every falling edge of the floppy read data (truncated)

C:
FASTRUN void FDD_stream_decode_ISR(void) {
  uint8_t timer;
  uint32_t isfr = PORTD_ISFR;
  PORTD_ISFR = isfr;


  digitalWriteFast(debugpin,HIGH);
  FTM0_SC  = FTM_SC_CLKS(0) | (PRESCALE);  // STOP TIMER
  timer    = (uint8_t)(FTM0_CNT&0xff);     // READ COUNT
  FTM0_CNT = 0;                            // RESET COUNT
  FTM0_SC  = FTM_SC_CLKS(1) | (PRESCALE);  // RESTART TIMER

  if (readmode==DECODE_EVENTS) {
    if      (TOOSHORT_DELTA) { badbitcnt++; } // bad pulse - too short!


    else if (SHORT_DELTA_4US) { // another same bit 0..0 or 1..1
      if (lastbit==0) { bitstream=(bitstream<<1)|0; syncbits=(syncbits<<1)|0; bitcnt++; }
      else            { bitstream=(bitstream<<1)|1; syncbits=(syncbits<<1)|0; bitcnt++; }
    }


    else if (MEDIUM_DELTA_6US) { // invert 0...1 or 1...0
      if (lastbit==0) { bitstream=(bitstream<<1)|1; syncbits=(syncbits<<1)|0; lastbit=1; bitcnt++; }
      else            { bitstream=(bitstream<<2)|0; syncbits=(syncbits<<2)|0; lastbit=0; bitcnt+=2; }
    }


    else { // 8US // adds 01       0..1..0 (special sync only) or 1..0..1
      if (lastbit==1) { bitstream=(bitstream<<2)|1; syncbits=(syncbits<<2)|0; bitcnt+=2; }
      else            { bitstream=(bitstream<<2)|0; syncbits=(syncbits<<2)|3; bitcnt+=2; }
    }

Routine for reading data, which switches off SysTick interrupts and enables interrupts for the index pulse and data stream...

C:
  index_detected=0;          // reset variable - can only be set in interrupt routine
  while (index_detected==0); // wait for index pulse

  FTM0_SC   = FTM_SC_CLKS(1) | (PRESCALE); // system (48MHz) clock + no prescale
  FTM0_CNT  = 0;

  // disable SYSTICK interrupt by setting the interrupt enable bit to zero
  SYST_CSR = SYST_CSR_CLKSOURCE; // disable SYSTICK interrupts

  clearpending_FDD_stream_decode_ISR();
  enable_FDD_stream_decode_ISR();

  index_detected=0;          // reset variable - can only be set in interrupt routine
  while (index_detected==0); // wait for index pulse

  disable_FDD_stream_decode_ISR();
  clearpending_FDD_stream_decode_ISR();
  FTM0_SC   = FTM_SC_CLKS(0) | (PRESCALE); // disable clock + no prescale


  // re-enable the SYSTICK interrupt by setting the interrupt enable bit to one
  SYST_CSR = SYST_CSR_CLKSOURCE | SYST_CSR_TICKINT | SYST_CSR_ENABLE; // re-enable SYSTICK
  //

Just noticed my comment says that the system clock is 48MHz.
 
A quick update. I have a Teensy3.1, but it needed to be de-soldered from its current use.

I tried the floppy decode/read and the behaviour was the same as observed on the Teensy3.2. Only about 1 read in 10 returns the correct data. Note: this was compiled with the 96MHz overclock.

I did try compiling at 72MHz stock speed, but even the small sector headers were not read correctly.

I toggle a digital I/O at the beginning of the data bit interrupt routine and reset it at the end. I used this to trigger my scope. At 96MHz, the interrupt routine usually takes between 1.1 and 1.5us, but I can see a faint trace suggesting at worst it can take 3us. Note that minimum pulse to pulse time can be 4us for the read-data line from the floppy (plus jitter caused by rotation speed variations). Playing with the "greater than pulse width" trigger setting on the Rigol revealed a max interrupt time of 3.2us. So I think this confirms I need the CPU overclock (or a better decode algorithm!). Note: the interrupt code is in RAM (aka FASTRUN).
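A rough way to sanity-check the timing budget above is sketched below. It assumes (and it is only an assumption) that the interrupt routine is purely CPU-bound, so that its duration scales inversely with the CPU clock. The function names are hypothetical.

```c
#include <stdint.h>

/* Scale a measured ISR duration (in nanoseconds) from one CPU clock to
   another, assuming the time is inversely proportional to the clock. */
uint32_t scale_isr_ns(uint32_t measured_ns, uint32_t measured_mhz,
                      uint32_t target_mhz)
{
    return (uint32_t)(((uint64_t)measured_ns * measured_mhz) / target_mhz);
}

/* Returns 1 if the ISR finishes before the next pulse can arrive. */
int isr_fits(uint32_t isr_ns, uint32_t min_gap_ns)
{
    return isr_ns < min_gap_ns;
}
```

Under that assumption, the 3.2us worst case measured at 96MHz would stretch to roughly 4.27us at 72MHz, which is longer than the 4us minimum pulse spacing, whereas at 96MHz it still fits.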
 
I am curious: if you overclock F_BUS (in kinetis.h, line 764), will things improve?

This may help a bit:
Code:
void setup() {
  // put your setup code here, to run once:
  Serial.begin(115200);
  while (!Serial && millis() < 2000 ); // 2 sec timeout
  CPUspecs();
}

void loop() {
  // put your main code here, to run repeatedly:

}

// ***********************************************************************************************************
// *
// *                            CPU specs
// * Thanks defragster ):
// ***********************************************************************************************************
void CPUspecs() {
  Serial.println();
#if defined(__MK20DX128__)
  Serial.println( "CPU is T_LC");
#elif defined(__MK20DX256__)
  Serial.println( "CPU is T_3.1/3.2");
#elif defined(__MKL26Z64__)
  Serial.println( "CPU is T_3.0");
#elif defined(__MK64FX512__)
  Serial.println( "CPU is T_3.5");
#elif defined(__MK66FX1M0__)
  Serial.println( "CPU is T_3.6");
#elif defined(__IMXRT1052__)
  Serial.println( "CPU is T_4.0 BETA");
  //#elif defined(__IMXRT1062__)
#elif defined(ARDUINO_TEENSY40)// || defined(ARDUINO_TEENSY41) || defined(ARDUINO_TEENSY_MICROMOD)
  Serial.println( "CPU is T_4.0");
#elif defined(ARDUINO_TEENSY41)
  Serial.println( "CPU is T_4.1");
#elif defined(ARDUINO_TEENSY_MICROMOD)
  Serial.println( "CPU is T_4~MICROMOD");
#endif
  Serial.print( "F_CPU =");   Serial.println( F_CPU );
  Serial.print( "ARDUINO =");   Serial.println( ARDUINO );
#if !defined(__IMXRT1062__) // TODO FIX FOR TEENSY 4.0 // !defined(__IMXRT1052__) T4 BETA
  Serial.print( "F_PLL =");   Serial.println( F_PLL );
  Serial.print( "F_BUS =");   Serial.println( F_BUS );
  Serial.print( "F_MEM =");   Serial.println( F_MEM );
#endif
#if defined(__IMXRT1062__) // TEENSY 4.x
  Serial.println( "TEENSY 4.0~4.1 F_CPU_ACTUAL is a uint32_t that gives the current CPU speed in Hz");
  Serial.print( "F_CPU_ACTUAL =");   Serial.println( F_CPU_ACTUAL );
  Serial.print( "F_BUS_ACTUAL =");   Serial.println( F_BUS_ACTUAL );
#endif
  Serial.print( "NVIC_NUM_INTERRUPTS =");   Serial.println( NVIC_NUM_INTERRUPTS );
  Serial.print( "DMA_NUM_CHANNELS =");   Serial.println( DMA_NUM_CHANNELS );
  Serial.print( "CORE_NUM_TOTAL_PINS =");   Serial.println( CORE_NUM_TOTAL_PINS );
  Serial.print( "CORE_NUM_DIGITAL =");   Serial.println( CORE_NUM_DIGITAL );
  Serial.print( "CORE_NUM_INTERRUPT =");   Serial.println( CORE_NUM_INTERRUPT );
  Serial.print( "CORE_NUM_ANALOG =");   Serial.println( CORE_NUM_ANALOG );
  Serial.print( "CORE_NUM_PWM =");   Serial.println( CORE_NUM_PWM );
  Serial.println("");
}
 
Wow! Great. Looks like a really useful little program. Many thanks.
For some reason I had a problem with Serial starting, so I removed the millis() part.

I'm pretty sure my board is a Teensy3.0. It has no text indicating this on the bottom of the PCB (neither does the 3.1), but the 3.2 does have this confirmation of board type. Thankfully I added stickers to the top of the CPU when I bought the boards.

Teensy3.0 (96MHz overclock)
CPU is T_LC
F_CPU =96000000
ARDUINO =10813
F_PLL =96000000
F_BUS =48000000
F_MEM =24000000
NVIC_NUM_INTERRUPTS =46
DMA_NUM_CHANNELS =4
CORE_NUM_TOTAL_PINS =34
CORE_NUM_DIGITAL =34
CORE_NUM_INTERRUPT =34
CORE_NUM_ANALOG =14
CORE_NUM_PWM =10


Teensy3.1 (96MHz overclock)
CPU is T_3.1/3.2
F_CPU =96000000
ARDUINO =10813
F_PLL =96000000
F_BUS =48000000
F_MEM =24000000
NVIC_NUM_INTERRUPTS =95
DMA_NUM_CHANNELS =16
CORE_NUM_TOTAL_PINS =34
CORE_NUM_DIGITAL =34
CORE_NUM_INTERRUPT =34
CORE_NUM_ANALOG =21
CORE_NUM_PWM =12
So there are more interrupts which could be possible sources of the problem (clutching at straws).
 
Well, there is actually an issue with the code.
Code:
#if defined(__MK20DX128__)
  Serial.println( "CPU is T_3.0"); // FIX ME ):
#elif defined(__MK20DX256__)
  Serial.println( "CPU is T_3.1/3.2");
#elif defined(__MKL26Z64__)
  Serial.println( "CPU is T_LC");
 
It's been a while, so I'm not sure, but I think all Teensy 3.0 boards had a black PCB?

TEENSY 3.0.jpg
 
Most but not all Teensy 3.0 had black solder mask. Some of the earliest Teensy 3.1 were also black.

The board in this photo is definitely a Teensy 3.0. The lighting doesn't quite catch the main chip well, but the marking is readable as "MK20DX128VLH5" after adjusting the brightness.

1701742449509.png
 
I would appreciate any pointers (other postings, web sites) as to how to identify all sources of interrupts and the code needed to disable them. The MK20 SoC is a complicated and sophisticated architecture and I am no expert. It's significantly more complex than the good old 8031 :) .

My idea is to disable all interrupts (during disk data reading), apart from the two which are needed, the index pulse interrupt and the read-data interrupt. I already disable the SysTick interrupt, which at the time I created this project, I had zero knowledge about. I am now reasonably sure on the Teensy3 it is used as the milliseconds timer tick interrupt.

I do have the PDF of the "K20 Sub-Family Reference Manual", but it is a big read.

These are my two needed interrupts.

C:
// interrupt on the rising edge of the index pulse
  attachInterrupt(fromFDDindexPin,FDD_index_ISR,RISING);
  attachInterrupt(fromFDDreadPin, nullptr, RISING); // just using this to setup the CHANGE interrupt trigger
  attachInterruptVector(IRQ_PORTD, FDD_stream_decode_ISR); // this is the ISR that will be run
 
Before you disable other interrupts, perhaps first try NVIC_SET_PRIORITY(IRQ_PORTD, 0);

0 is the highest priority, 255 is the lowest. By default, various interrupts are assigned 32 to 128, so setting 0 will allow it to interrupt any others, and no other can interrupt it.
 
I have done some further investigation, looking at the raw data captured from the FTM timer module.

I also compared a good data stream against the place where it was corrupted. I tried to line up the re-constructed floppy read pulses, but when I did I found the corrupted data needed an extra pulse/interrupt. Once I printed the raw values I am seeing that the interrupt sometimes returns a timer count of ZERO, which of course makes no sense.

I vaguely recall a posting (which I can't find now) about an extra opcode being needed at the end of an interrupt routine, for a reason I can't recall either.

The interrupt code (start of) is as follows. It stops the timer, reads it, resets it and starts it again. I can't see how a value of zero could be generated.

C:
FASTRUN void FDD_stream_decode_ISR(void) {
  uint8_t timer;
  uint32_t isfr = PORTD_ISFR;
  PORTD_ISFR = isfr;

  FTM0_SC  = FTM_SC_CLKS(0) | (PRESCALE);  // STOP TIMER
  timer    = (uint8_t)(FTM0_CNT&0xff);     // READ COUNT
  FTM0_CNT = 0;                            // RESET COUNT
  FTM0_SC  = FTM_SC_CLKS(1) | (PRESCALE);  // RESTART TIMER

The code above came from
link to forum article on capturing pulses
 
C:
const byte fromFDDindexPin = 22;
const byte fromFDDreadPin  = 21; // PORT-D IRQ #43


const byte fromFDDreadPinIRQ  = 43; // IRQ_PORTD = 43,   // kinetis.h (Teensy 3.0)
 
C:
NVIC_SET_PRIORITY(fromFDDreadPinIRQ, 0);   // HIGHEST priority

I am pretty sure that during the track/sector read only the index pulse and read-data can generate interrupts. I switched off the SysTick. As for other sources of interrupt, I'm unsure. I haven't disabled USB interrupts, but this doesn't appear to cause a problem with the Teensy3.0.

I reviewed the startup code in the ".../teensy/avr/cores/teensy3" folder, and I could only see one difference between the 3.0 and 3.1/2 code, something to do with FTM2 settings. In any case I don't think FTM2 is used/enabled.

I am now going away from the idea of a spurious interrupt due to an unknown enabled interrupt.
 
You can only set the priority for the whole port not just one pin.
Code:
void setup() {
  // put your setup code here, to run once:
  NVIC_SET_PRIORITY(IRQ_PORTD, 0);
}

void loop() {
  // put your main code here, to run repeatedly:
}
 
PRIORITY for USB defaults to 112, the hardware serial ports default to 64, and systick defaults to 32 (it was 0 a long time ago).
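The priority rule can be stated as a one-line check. This is a minimal sketch of the rule only (a lower number wins, and equal priorities do not preempt each other); `can_preempt` is a hypothetical helper, not a Teensy or CMSIS function, and it ignores any sub-priority grouping.

```c
#include <stdint.h>

/* Returns 1 if an interrupt with priority new_prio can preempt a handler
   that is currently running at running_prio. On Cortex-M a numerically
   lower value is more urgent, and equal priorities do not preempt. */
int can_preempt(uint8_t new_prio, uint8_t running_prio)
{
    return new_prio < running_prio;
}
```

With the defaults quoted above, `can_preempt(0, 112)` is true (a priority 0 PORTD interrupt can interrupt the USB handler), while `can_preempt(32, 0)` is false (SysTick cannot interrupt a priority 0 handler).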
 
Apologies. Please ignore my "read ZERO" from the FTM. It was a bug in my debug routine in the ISR which records values in an array.
 
Hey, I am just curious: is there any more information somewhere about your Atari ST disk drive project?
Just asking because I am big fan of Atari computers.
Here is one of my projects for Atari Falcon 030 FDI+ Work In Progress !!!
Sorry. I don't have a blog or a GitHub. Nevertheless I am very happy to share the code with anyone interested.
If you want deeper details/information, probably best to PM me and I will answer ASAP.

I kept many of my Atari ST floppies from the 80's. I have now been able to recover much of my old assembler, GFBasic and C source code. Nothing special, but I may resurrect one project. I am kicking myself because I threw out (I do not remember selling) my Lattice C, Lattice Pascal, DevPac ST and GFBasic packages and documentation. Annoyingly I clearly did not back up the floppies :-( .

I'll post a picture of my lash up later. My phone battery just died.
 
Interesting.
Here is my (Atari HD brick) HD ACSI implementation with Media Transfer Protocol.
ATARI STE HD.jpg

I'm pushing the limits of the Atari ST/e DMA speed, based on Teensy 4.1. :D
 
I vaguely recall a posting (which I can't find now) about an extra opcode being needed at the end of an interrupt routine, for a reason I can't recall either.

Maybe <this> is the post that you are recalling ?? This may or may not apply, as the post recommends adding the "DSB" command at the end of an interrupt function when using the T4.x (which is much faster than the T3.x).

Mark J Culross
KD5RXT
 
Yes, thank you. That was the posting.

I added the asm command, which was after a digitalWriteFast(), so it did follow a hardware write. Unfortunately the problem has not been fixed. :-(
 
I don't know if I am wasting my time by going down a long rabbit hole....

I am intrigued as to what the problem could be, and I am curious enough to keep investigating, even though I already have a workflow using the Teensy3.0 and an FTDI module for serial transfer.

While developing the code to decode the data stream, I added debug code to capture timer values into an array. After seeing the index pulse, I capture a stream of timer values which corresponds to the rising edge-to-edge of the floppy data pin. They are stored in an array.

Just playing with the Teensy3.0 right now, the timer values captured are amazingly repeatable, given that we are dealing with a mechanical motor to rotate the floppy. I bin timer values into 4us, 6us or 8us pulse "bins". Due to re-writing of sector data, there can be runt pulses (*), but they get filtered out/ignored while searching for the missing clock synchronisation data stream. Even the runt pulses return the same values (outside the expected 4/6/8us).
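The binning described above could be sketched as follows. The threshold values are hypothetical: they assume about 12 timer counts per microsecond (so nominal counts of 48, 72 and 96) and simply sit halfway between neighbouring bins. The actual limit values in the project were chosen from the histogram and are not shown in this thread.

```c
#include <stdint.h>

typedef enum { PULSE_RUNT, PULSE_4US, PULSE_6US, PULSE_8US } pulse_bin_t;

/* Hypothetical limits in timer counts, assuming 12 counts per microsecond. */
#define RUNT_MAX   36  /* below this: too short to be a valid pulse */
#define SHORT_MAX  60  /* midpoint between 48 (4us) and 72 (6us)    */
#define MEDIUM_MAX 84  /* midpoint between 72 (6us) and 96 (8us)    */

/* Sort one edge-to-edge timer count into a pulse bin. */
pulse_bin_t classify_pulse(uint8_t count)
{
    if (count < RUNT_MAX) {
        return PULSE_RUNT;
    }
    if (count < SHORT_MAX) {
        return PULSE_4US;
    }
    if (count < MEDIUM_MAX) {
        return PULSE_6US;
    }
    return PULSE_8US;
}
```

With these example limits a count of 70 lands in the 6us bin, and anything below 36 is treated as a runt and can be ignored while searching for the synchronisation pattern.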

Consequently Teensy3.0 disk read/decode is 100% reliable.

Changing to a Teensy3.1 the timer values just don't seem as repeatable. I have zero idea why.

Investigating a sector containing only 0x4e values, which gets corrupted, revealed that one pulse which should have binned as 8us, binned as 6us and resulted in the decode stream getting confused. All the surrounding pulses for each 0x4e databyte were correct. Why one 8us pulse should return a value of 6us, when all surrounding measurements were as expected makes no sense. It is as if 2us was lost from the data stream.

Using a scope to look at the floppy read data-stream into the Teensy, I can't see any problems with it. The floppy data is buffered by an open collector gate with a pull-up on the Teensy input, which acts as a level converter. For the 3.0 I use 3.3V for the pull-up. For the 3.1 I tried both 3.3V and 5V for the pull-up, but apart from making the pulse taller (as expected), it made no difference to the decode. I believe the 3.0 is 3.3V only, but the 3.1 is 5V tolerant.
 
By default, pinMode configures the pin with a feature called slew rate limiting. Maybe that's the problem?
Code:
void setup()
{
  pinMode(?, OUTPUT);
  CORE_PIN?_CONFIG &= ~PORT_PCR_SRE; // turn off the slew rate limit
}

EDIT: Not sure if this will work for input instead of output ?
 
Thank you for your suggestion Chris. I tried adding the suggested change, as follows.

C:
// slew rate
  CORE_PIN11_CONFIG |= PORT_PCR_SRE;
  CORE_PIN12_CONFIG |= PORT_PCR_SRE;
  CORE_PIN14_CONFIG |= PORT_PCR_SRE;
  CORE_PIN15_CONFIG |= PORT_PCR_SRE;
  CORE_PIN16_CONFIG |= PORT_PCR_SRE;
  CORE_PIN23_CONFIG |= PORT_PCR_SRE;
  CORE_PIN22_CONFIG |= PORT_PCR_SRE;
  CORE_PIN21_CONFIG |= PORT_PCR_SRE;

I compiled and ran. I then used the 'a' command which tries to read a single sector. It did seem to improve but not 100% fix the reliability. I attempted 20 reads.

I removed the above code, recompiled, and tried again, once more 20 reads.

I went back and forth again.

With slew rate (try 1) 8/20 good reads.
Without slew rate (try 2) 3/20 good reads.
With slew rate (try 3) 7/20 good reads.
Without slew rate (try 4) 1/20 good reads.
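Pooling the four trials above gives a rough comparison. The sketch below only does the arithmetic; `success_pct` is a hypothetical helper, and with just 20 reads per trial the figures should be treated as indicative at best.

```c
/* Percentage of good reads; returns 0 if no reads were attempted. */
double success_pct(unsigned good, unsigned total)
{
    return total ? (100.0 * good) / total : 0.0;
}
```

With the slew rate lines included, tries 1 and 3 total 15 good reads out of 40 (37.5%). Without them, tries 2 and 4 total 4 out of 40 (10%).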

I would have expected slew rate to only affect outputs and not inputs. During the data read phase all outputs are static.

So the result is IMHO unexpected, but a clue as to where to look.

I did try commenting out each of the lines above, and some appeared to make no difference, and some appeared to be needed for an improvement. I now have to go away and confirm if I have understood the pin mapping numbers correctly. The numbers above correspond to the digital pin numbers. The last three lines are the numbers for the digital inputs (track 0 sensor, read data and index pulse). They did seem to be needed to improve reliability. Each time I try something I get slightly different results making it difficult to see a pattern.

More work. I guess it's further down the rabbit hole I go :-( .
 