Teensy 4.1: release thread from interrupt

But ... if you need to restore the NVIC IRQ enable flag to its previous state, you need to read it and then modify it ... and an interrupt could occur between those two accesses resulting in the read-back value being stale.
How?
In this unlikely situation where both a thread and an interrupt are messing with an IRQ's state, they can't overlap execution - the interrupt will interrupt the thread, do its thing (returning the IRQ to whatever state it was in upon entry) and then the thread will continue none the wiser. The IRQ state won't be "stale" as long as the interrupt handler puts it back how it found it.
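To make the argument above concrete, here is a rough host-side sketch (not Teensy code). The enable flag is a plain variable standing in for the NVIC enable bit, and `polite_isr`, `rude_isr` and `thread_read_goes_stale` are made-up names for illustration. The "interrupt" is simply called between the thread's read and its write, which is the worst-case moment being discussed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one NVIC enable bit so this runs on a PC. */
static volatile bool irq_enabled;

/* Well-behaved ISR: changes the flag, but puts it back before returning. */
static void polite_isr(void)
{
    bool saved = irq_enabled;
    irq_enabled = false;   /* temporarily mask the IRQ */
    /* ... ISR work ... */
    irq_enabled = saved;   /* restore the state found on entry */
}

/* Badly-behaved ISR: leaves the flag changed on exit. */
static void rude_isr(void)
{
    irq_enabled = false;
}

/* Thread-side save/disable/restore with an "interrupt" injected between
   the read and the write. Returns true if the value the thread read no
   longer matched the real flag when the thread did its write. */
static bool thread_read_goes_stale(void (*isr)(void))
{
    assert(isr != 0);
    irq_enabled = true;
    bool was_enabled = irq_enabled;       /* access 1: read */
    isr();                                /* interrupt lands here */
    bool actual = irq_enabled;            /* real state just before access 2 */
    irq_enabled = false;                  /* access 2: disable */
    /* ... critical section ... */
    if (was_enabled) irq_enabled = true;  /* restore from the saved value */
    return was_enabled != actual;
}
```

With `polite_isr` the saved value is still correct when the thread resumes; with `rude_isr` the thread ends up re-enabling an IRQ that the handler had deliberately left disabled.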
 
I'm thinking about the likelihood of an ISR messing with an enable flag that doesn't belong to it. You just know users will do it ... like doing serial output from an ISR.

Or maybe we just leave that sort of user to the consequences of their actions.
 
Is it undefined? Or is it as ARM documents with strongly ordered memory?
LDREX/STREX are only guaranteed to work with memory-type=normal, not device or strongly-ordered. In practice it's down to how it's connected to the CPU; peripherals on an AXI bus have a reasonably good chance of working.

The case where strongly-ordered memory isn't guaranteed to work atomically is when it's used with instructions that read/write multiple values, e.g. ldm/stm/ldrd/strd/etc. because they can be interrupted in the middle and may restart from the beginning when they are resumed, which can result in the same memory location being accessed multiple times (which could obviously cause problems with memory-mapped IO ranges that have side-effects, like clearing flags when they are read or triggering hardware operations when they are written).
 
I've had a bit of a play, and a look at some documentation, and can't see anything to contradict the above - it can't be done [guaranteed properly].

As an immediate example where IRQ disabling appears to be done wrong, consider this edited snippet, from usb_serial.c:
C:
static void rx_queue_transfer(int i)
{
    NVIC_DISABLE_IRQ(IRQ_USB1);
    // ...
    NVIC_ENABLE_IRQ(IRQ_USB1);
}

// read a block of bytes to a buffer
int usb_serial_read(void *buffer, uint32_t size)
{
    // ...
    NVIC_DISABLE_IRQ(IRQ_USB1);
    while (count < size && tail != rx_head) {
    // ...
        if (avail > len) {
            // partially consume this packet
            // ...
        } else {
            // fully consume this packet
            // ...
            rx_queue_transfer(i);
        }
    }
    NVIC_ENABLE_IRQ(IRQ_USB1);
    return count;
}
Oh deary deary me...
 
Are you saying it’s done wrong because it doesn’t reenable the previous IRQ state?
 
Neither function re-enables the previous IRQ state. And usb_serial_read() attempts to disable the IRQ while it fills the buffer, but then may call rx_queue_transfer() which enables it. At best, this is surely not as intended. At worst, it'll result in a bug that's very intermittent and hard to track down.
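For illustration only, here is a host-side sketch of the nesting problem and one possible way around it, using save/restore instead of unconditional enable. The enable bits are a fake array rather than the real NVIC registers, and every name here (`fake_enable_bits`, `irq_save_disable`, `irq_restore`, the `*_outer`/`*_inner` functions) is hypothetical, not something from usb_serial.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_IRQ_COUNT 160u   /* hypothetical table size for the mock */

/* Mock of the NVIC enable bits so the sketch runs off-target. */
static volatile uint32_t fake_enable_bits[(FAKE_IRQ_COUNT + 31u) / 32u];

static bool irq_is_enabled(unsigned irq)
{
    assert(irq < FAKE_IRQ_COUNT);
    return (fake_enable_bits[irq >> 5] >> (irq & 31u)) & 1u;
}
static void irq_enable(unsigned irq)  { assert(irq < FAKE_IRQ_COUNT); fake_enable_bits[irq >> 5] |=  (1u << (irq & 31u)); }
static void irq_disable(unsigned irq) { assert(irq < FAKE_IRQ_COUNT); fake_enable_bits[irq >> 5] &= ~(1u << (irq & 31u)); }

/* Disable the IRQ and report whether it was enabled beforehand. */
static bool irq_save_disable(unsigned irq)
{
    bool was_enabled = irq_is_enabled(irq);
    irq_disable(irq);
    return was_enabled;
}

/* Re-enable only if the IRQ was enabled when we saved it. */
static void irq_restore(unsigned irq, bool was_enabled)
{
    if (was_enabled) irq_enable(irq);
}

/* Same shape as the snippet: the inner function unconditionally re-enables. */
static void naive_inner(unsigned irq) { irq_disable(irq); /* ... */ irq_enable(irq); }

/* Returns true if the IRQ is still masked after the inner call. */
static bool naive_outer(unsigned irq)
{
    irq_disable(irq);
    naive_inner(irq);
    bool still_masked = !irq_is_enabled(irq);
    irq_enable(irq);
    return still_masked;
}

/* Save/restore variant: the inner function leaves the state as it found it. */
static void saved_inner(unsigned irq)
{
    bool was = irq_save_disable(irq);
    /* ... */
    irq_restore(irq, was);
}

static bool saved_outer(unsigned irq)
{
    bool was = irq_save_disable(irq);
    saved_inner(irq);
    bool still_masked = !irq_is_enabled(irq);
    irq_restore(irq, was);
    return still_masked;
}
```

In the naive version the rest of the outer loop runs with the IRQ enabled again after the inner call; in the save/restore version it stays masked until the outer function restores it. On real hardware the read-then-write in `irq_save_disable` is still two accesses, so it relies on handlers restoring whatever they change.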
 
Hi everybody, apologies for the delay in responding. We were out of town and occupied.

Per the request for timing benchmarks, see: Teensy 4.0/UNO R4 timing (github)

This is code that I threw together to check timing. It outputs average and maximum timings using the cycle counter, but in practice I use it with an oscilloscope. See the subdirectory "results". With a large number of iterations, one might hope that the minimum is close to the average, but admittedly that is an assumption and perhaps I should output that number as well.
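As a rough sketch of reporting the minimum alongside the average and maximum, something like the following accumulator could be used. It is not the code from the linked repository; `cycle_stats_t` and its functions are made-up names, and the start/end values are assumed to be raw readings of a free-running 32-bit cycle counter (on the Teensy 4 core that would presumably be `ARM_DWT_CYCCNT`).

```c
#include <assert.h>
#include <stdint.h>

/* Running min / max / sum over elapsed-cycle samples. */
typedef struct {
    uint32_t min;
    uint32_t max;
    uint64_t sum;
    uint32_t count;
} cycle_stats_t;

static void cycle_stats_init(cycle_stats_t *s)
{
    s->min = UINT32_MAX;
    s->max = 0;
    s->sum = 0;
    s->count = 0;
}

/* Add one sample from two raw counter readings. Unsigned subtraction
   gives the right answer even if the counter wrapped in between. */
static void cycle_stats_add(cycle_stats_t *s, uint32_t start, uint32_t end)
{
    uint32_t elapsed = end - start;
    if (elapsed < s->min) s->min = elapsed;
    if (elapsed > s->max) s->max = elapsed;
    s->sum += elapsed;
    s->count++;
}

/* Integer average (rounded down); needs at least one sample. */
static uint32_t cycle_stats_avg(const cycle_stats_t *s)
{
    assert(s->count > 0);
    return (uint32_t)(s->sum / s->count);
}
```

Printing `min`, the average and `max` together makes it easy to see whether the minimum really does sit close to the average.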

For the Teensy: Digital read is solid at 26 cycles (43nsecs). Digital write is 12 to 29 cycles (20 to 48nsecs). Interrupt latency for pins (or specifically, the pin used in the test) averages 119 cycles with a maximum of 145 (198 to 242nsecs, 25% variation). But skipping the API and connecting to the interrupt directly brings it down to 66 to 71 cycles (110 to 118nsecs, 10% variation).

(The latency may seem slow, but that may be due to the larger number of registers that need to be saved. I recall that in DSPs, where we have direct control of the context switch, we would save the minimum set of registers required for a given ISR. That number, 66 cycles, for the ARM saving everything might be close to what is expected.)

For a comparison, the UNO R4: Digital read is 72 to 204 cycles (1.5 to 4.3usec), write is 21 to 273 cycles (0.44 to 3.6usec), latency for a pin interrupt is 182 to 312 cycles (3.8 to 6.5usecs) and SPI transfer16 takes 327 to 464 cycles (6.8 to 10usecs).
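For anyone wanting to redo the cycles-to-time conversion, a minimal sketch follows. The clock rates are assumptions inferred from the figures quoted above (26 cycles ≈ 43 ns suggests 600 MHz for the Teensy; 72 cycles ≈ 1.5 µs suggests 48 MHz for the UNO R4); the names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed core clocks, inferred from the quoted numbers rather than
   taken from the benchmark code. */
#define TEENSY4_ASSUMED_HZ 600e6
#define UNO_R4_ASSUMED_HZ   48e6

/* Convert a cycle count to nanoseconds at the given core clock. */
static double cycles_to_ns(uint32_t cycles, double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return (double)cycles * 1e9 / cpu_hz;
}
```

For example, `cycles_to_ns(66, TEENSY4_ASSUMED_HZ)` gives 110 ns, matching the direct-interrupt latency figure above.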

And for a practical application: The following is an example of a project that uses timing on the above scale for the Teensy. It implements a state machine to operate the Hamamatsu S11639-01 sensor with the Teensy 4 (github)
 
Timers aren't limited to 16 bits; quad timers are 16-bits each with four timers in each module that can be cascaded, giving a maximum total of 64-bits.
Could you please show an example how to cascade two FlexPWM without using an ISR? That would be sooo helpful.

Thank you
 
You don't need the interrupt / cpu intervention. That timer can be routed through the crossbar to trigger SPI directly.
Then SPI can be read by DMA to fill a buffer, only requiring CPU intervention when the buffer is full (and automatically switching to a new buffer to give the CPU plenty of time to process the old one).
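The buffer-switching part of that idea can be sketched on a PC, leaving out all the hardware setup (timer, crossbar, LPSPI and DMA configuration are not shown and would need the reference manual). Here `pingpong_push` stands in for the DMA depositing one 16-bit SPI word; the type, the function names and the tiny `BUF_LEN` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_LEN 4u   /* hypothetical; tiny so the behaviour is easy to follow */

typedef struct {
    uint16_t buf[2][BUF_LEN];
    unsigned active;         /* index of the buffer currently being filled */
    size_t fill;             /* samples written to the active buffer so far */
    const uint16_t *ready;   /* last completed buffer, or NULL if none yet */
} pingpong_t;

static void pingpong_init(pingpong_t *p)
{
    assert(p != NULL);
    p->active = 0;
    p->fill = 0;
    p->ready = NULL;
}

/* Stand-in for the DMA storing one received word. Returns 1 when the
   active buffer has just filled (the only point where the CPU would be
   notified), after switching to the other buffer; otherwise returns 0. */
static int pingpong_push(pingpong_t *p, uint16_t sample)
{
    assert(p != NULL);
    p->buf[p->active][p->fill++] = sample;
    if (p->fill < BUF_LEN)
        return 0;
    p->ready = p->buf[p->active];  /* hand the full buffer to the CPU */
    p->active ^= 1u;               /* keep filling the other one */
    p->fill = 0;
    return 1;
}
```

The CPU then has one whole buffer period to process `ready` before it gets overwritten.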

Wow, that would be fantastic. It is a really generic problem, just about every ADC and DAC that we might use needs this. I asked here and NXP and was told it is not possible.

Can you please show an example using the flexpwm and SPI without an ISR?

The simplest example would be sufficient: At 1.0 usec intervals, output a 300 usec pulse and start a single 16 bit SPI on the trailing edge.

Actual ADCs are a little bit more complicated, but we can implement any of them if we can do the above.

Thank you.
 
I've told you what can be done, I'm not going to spend my personal time working on something I'm not interested in for free.
 
Well.. the reason I asked, is that NXP's support engineer told me it can't be done, i.e. starting the SPI from the flexpwm without an ISR.

We talked about this quite a bit and eventually I filed a support ticket to ask how to do it. That took a little while too, but finally the above answer came back: the only way to start the SPI from the FlexPWM is by means of an ISR. The same seems to hold for other functions (setting a pin, starting another FlexPWM directly), and these questions seem to go unanswered on the NXP forum as well. But my recollection could be off.

So, in other words, connecting the FlexPWM to functions other than driving its own output pulses seems to require an ISR, and that makes all of it susceptible to ISR latency and other interrupts.

The FlexPWM manual seems to paint the picture that the use case they focused on is driving a stepper motor. So that makes sense.

Do we have a diagram for the base software on the Teensy?

For example : What else is there running in the background besides systicks? How is Serial implemented? What resources or peripherals are reserved to the base software?
 
Do we have a diagram for the base software on the Teensy?
Not that I know of
For example : What else is there running in the background besides systicks? How is Serial implemented? What resources or peripherals are reserved to the base software?
No other interrupts unless you use peripherals. When you run a blink example, SysTick is the only interrupt. If you use USB Serial you can get relatively long interrupt disable during transmit, is my understanding. UART serial is interrupt driven, with very short interrupt disables.
 
For a start I specifically said QUAD TIMER, not FlexPWM. Then you use the crossbar to route the timer's output to the appropriate LPSPI_INPUT_TRIGGER.
 
It seems like the linear CCD would need a combination of the FlexPWM and Quad Timer and I am not sure that it is possible.

In the following I connect ΦΜ, ICG and SH, to pins 4,5,6, all on FlexPWM2 submodules 0-2, all of that seems straightforward. For the ADC, the CNVST on pin 10 and the SPI are run from ISRs connected to interrupts on FlexPWM2 submodule 3, which is synchronized to and runs at four times the period of ΦΜ.

The code is here TCD1304Device2.h (header only library for the TCD1304 using the FlexPWM)

Now, the Quad Timer might take the place of submodule 3, count every four pulses from ΦΜ on FlexPWM2 submodule 0, and start the LPSPI. That would be fantastic! But something also has to assert CNVST at just the right moment and for 700 nanosecs. And the SPI has to start 30 nsecs after that. The timing of the first affects the precision of the measurement. The timing of the second affects whether all 16 bits can be transferred.

At the moment it is working, with some compromise to not run at the maximum speed theoretically supported by the sensor, ADC and SPI.
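As a back-of-envelope check of that timing budget, here is a small sketch. It assumes the simplest possible model: one pixel every four master-clock periods, with the CNVST pulse, the delay before SPI start, and the 16-bit transfer all happening one after another inside that pixel period. The function names are made up, and the clock rates used in the usage note are illustrative values, not taken from the datasheet or the library.

```c
#include <assert.h>
#include <stdbool.h>

/* Time left for the SPI transfer within one pixel period, in ns.
   Assumes one pixel per four master-clock periods. */
static double spi_window_ns(double master_clock_hz, double cnvst_ns,
                            double spi_delay_ns)
{
    assert(master_clock_hz > 0.0);
    double pixel_period_ns = 4.0 * 1e9 / master_clock_hz;
    return pixel_period_ns - cnvst_ns - spi_delay_ns;
}

/* True if a transfer of `bits` bits at spi_clock_hz fits in that window. */
static bool spi_transfer_fits(double master_clock_hz, double cnvst_ns,
                              double spi_delay_ns, double spi_clock_hz,
                              unsigned bits)
{
    assert(spi_clock_hz > 0.0);
    double transfer_ns = (double)bits * 1e9 / spi_clock_hz;
    return transfer_ns <= spi_window_ns(master_clock_hz, cnvst_ns, spi_delay_ns);
}
```

With a hypothetical 2 MHz master clock the pixel period is 2000 ns, leaving 1270 ns after the 700 ns CNVST pulse and 30 ns delay: a 16-bit transfer fits at a 20 MHz SPI clock (800 ns) but not at 10 MHz (1600 ns). Any ISR latency eats directly into that window.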

From the TCD1304 datasheet:

[Figure from the TCD1304 datasheet; image not preserved]

After the above, the rising edge of the ICG, the data appears every four clocks. The readout sequence looks like this:

[Figure: readout sequence; image not preserved]

[Two further attached images; not preserved]
 
For a start I specifically said QUAD TIMER, not FlexPWM. Then you use the crossbar to route the timer's output to the appropriate LPSPI_INPUT_TRIGGER.
Yes, it seems I keep missing the fine print. But I am not clear about how it would work. I think that is why I miss the fine print.

Can we start the Quad Timer from the FlexPWM? And vice versa, can we start the FlexPWM from the Quad Timer?
 