Teensy 4.1: release thread from interrupt

P.S. The Teensy code does use LDREX/STREX in a few places. For example, delay() calls micros(), and micros() uses LDREXW and STREXW.

And there is the occasional "TODO: use STREX to avoid interrupt disable".

So I would like to add the code in Serial.available to the TODO list. Maybe one of us can take a look at it, to give Paul a break.
 
If your application has a requirement that can't tolerate a 1-us delay in responding to an IRQ, then no, you can't do Serial.

Not true at all - if you do it the right way. And there are a few right ways to do it.

This is not a good excuse for disabling interrupts where an atomic flag will do the job.
 
I just meant you can't do it with Serial on T4. You are mistaken that the calls to disable_irq() can be replaced by use of those flags.
 
Turning off interrupts in T4 serial is not a programming choice, i.e. it could not be avoided via atomic memory operations

That is obviously not true as a general statement. It depends on what. Paul does it in delay.c, and he has TODO notes saying exactly that in EventResponder.c

And that is part of why the atomic operations exist.

Others would do it for the case we have been discussing. I probably will do it if someone doesn't get to it before me, which I hope they do, because I am falling behind on other work.
 
@joepasquariello

Are you assuming that I am using one of the thread environments mentioned above? I am not, yet.
Well, sometimes you do need to use a delay, when the overhead for a timer and context switch is longer than the delay, or when you want to hold the context.
Even without threads, you never have to call delay(). You can always do this instead, which just means you'll spin within loop instead of spinning within delay(), but it would allow you to check on other things more often.

Code:
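// Example interval in milliseconds (an assumed value; use whatever you need).
const uint32_t DELAY_MS = 100;
elapsedMillis ms;  // counts up automatically, in milliseconds

void setup() {
}

void loop() {
  if (ms >= DELAY_MS) {
    ms -= DELAY_MS;  // subtract rather than zero, so timing error doesn't accumulate
    // do whatever
  }
}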
 
I just meant you can't do it with Serial on T4. You are mistaken that the calls to disable_irq() can be replaced by use of those flags.

Well, then we respectfully disagree. I have already done exactly this umpteen times on other platforms.

But maybe you are saying there is something in the existing code that presents an obstacle? It is not obvious.

The only issue that comes to mind is if the ISR that services the port does the copying to the buffer, rather than DMA or something similar.
 
@joepasquariello It is not immediately clear from the reference manual that there is a receive interrupt. But there seem to be examples of code using it for our MCU in the NXP community. If so, that would be pretty easy: receive character, increment flag. Never a lost character, except for overrun or some other thread disabling global interrupts, and that is easily avoided.
 
Here, I think this might offer a good solution, still within the Arduino framework and without resorting to a full-blown RTOS.

The following are about implementing a spin lock in ARM7 and the NXP in particular.



The proposal is to implement a simplified sort of mutex or semaphore for use between code called from loop() and ISRs. One such pair is Serial.available() and the ISR that reads characters into the USB serial buffer. Rather than looping over __disable_irq() and checking the buffer, the ISR increments a counter, and Serial.available() checks the counter and only goes further, calling __disable_irq() and fetching characters, when there is something to fetch.
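To make the idea concrete, here is a minimal sketch of a receive buffer where the "counter" is an index that only the ISR writes. It is not the Teensy core's HardwareSerial code; every name in it (rxIsr, rxAvailable, rxRead, kBufSize) is hypothetical. It also goes a step further than the proposal above by not masking interrupts at all, which only holds under the assumption that exactly one ISR writes to the buffer and only loop-level code reads from it.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is not the Teensy core's HardwareSerial.
constexpr std::size_t kBufSize = 64;     // must be a power of two

static uint8_t rxBuf[kBufSize];
static std::atomic<uint32_t> rxHead{0};  // written only by the ISR
static std::atomic<uint32_t> rxTail{0};  // written only by the main loop

// Stand-in for the receive ISR: store one byte, then publish the new head.
// Returns false when the buffer is full and the byte is dropped.
bool rxIsr(uint8_t byte) {
  uint32_t head = rxHead.load(std::memory_order_relaxed);
  uint32_t tail = rxTail.load(std::memory_order_acquire);
  if (head - tail == kBufSize) return false;           // overrun
  rxBuf[head % kBufSize] = byte;
  rxHead.store(head + 1, std::memory_order_release);   // publish after the data
  return true;
}

// Number of unread bytes, computed without masking interrupts.
uint32_t rxAvailable() {
  return rxHead.load(std::memory_order_acquire) -
         rxTail.load(std::memory_order_relaxed);
}

// Fetch one byte, or -1 if nothing is waiting.
int rxRead() {
  uint32_t tail = rxTail.load(std::memory_order_relaxed);
  if (rxHead.load(std::memory_order_acquire) == tail) return -1;
  uint8_t byte = rxBuf[tail % kBufSize];
  rxTail.store(tail + 1, std::memory_order_release);
  return byte;
}
```

Because each index has a single writer, neither side needs a read-modify-write on shared state. If a second writer can appear, for example Serial.print called from another ISR, the single-writer assumption breaks and this sketch no longer applies.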

That alone might clean up a host of behaviors, all without having to resort to a full-blown RTOS.

Should we start a new topic thread for this?

I’d bet that the C++ primitives, e.g. std::mutex and the atomics (and semaphores from C++20), use these instructions.
 
Got it! Eureka!

There is a character match interrupt. Perfect. We interrupt on CR or LF.

Here is an article

 
Just to reiterate, I checked again the latest NXP driver for LPUART, and it disables interrupts in a number of places, as shown below. I find it a little hard to believe that this would be the case if there was an alternative.

Code:
  /* Disable and re-enable the global interrupt to protect the interrupt enable
  * register during read-modify-write. */
  irqMask = DisableGlobalIRQ();
  /* Re-enable LPUART RX IRQ. */
  base->CTRL |= (uint32_t)(LPUART_CTRL_RIE_MASK | LPUART_CTRL_ORIE_MASK);
  EnableGlobalIRQ(irqMask);
 
No, it is not a proof, sadly. I have seen worse from vendors, in code and silicon and datasheets too.

But back to the device at hand. So far, the only interrupts that I notice in the documentation are for byte compares, and maybe I saw something for the FIFO. Normally a receive FIFO would have at least half-full and overrun interrupts. However, an interrupt character, as noted, would be pretty good for a command interface if the FIFO is large enough.
 
If you are able to modify HardwareSerial to replace disable_irq() with something more clever, that's great. If not, would any of these help?
  • simple interrupt-driven LPUART with no FIFOs or circular buffers
  • polled LPUART
  • use SoftwareSerial instead
 
I’d bet that the C++ primitives, e.g. std::mutex and the atomics (and semaphores from C++20), use these instructions.
You can't use the C++ threading primitives on Teensy. They have to be implemented (e.g. like all the other stuff provided by newlib) by a kernel since you can't block on a mutex, semaphore, etc. without some sort of context/threading system including an idle thread that can always run when everything else is blocked. Newlib doesn't attempt any of that so they throw a compiler error.

The atomics are available and do use ldrex/strex, although they also include extra steps (emitting barriers) designed for multi-core CPUs that are unnecessary in this case.
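As a small illustration of that point, the sketch below shares an event counter between an ISR and loop() using std::atomic. The names (eventCount, onEventIsr, takeEvents) are hypothetical. Requesting std::memory_order_relaxed asks only for atomicity of the read-modify-write, so on a single-core part the compiler should typically be able to omit the barrier instructions; that is an expectation to check in the generated assembly, not a guarantee.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical event counter shared between an ISR and loop().
static std::atomic<uint32_t> eventCount{0};

// Called from the (hypothetical) ISR. Relaxed ordering requests an atomic
// increment without ordering it against other memory accesses.
void onEventIsr() {
  eventCount.fetch_add(1, std::memory_order_relaxed);
}

// Called from loop(): atomically take all pending events and reset to zero.
uint32_t takeEvents() {
  return eventCount.exchange(0, std::memory_order_relaxed);
}
```

Relaxed ordering is only appropriate when the counter itself is the whole message. If the ISR also fills a buffer that loop() reads after seeing the count, release/acquire ordering is the safer choice.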
 
Haven't looked at it but I would guess it's to protect against someone using Serial.print in an ISR, which could trigger in the middle of updating some of the buffer tracking variables and end up corrupting them. Disabling the serial interrupt alone wouldn't be enough to protect against this since it could potentially be caused by any other interrupt.

IMO there's nothing wrong with using disable_irq() as long as whatever work is performed has either a fixed or maximum time it will take to complete, i.e. no loops that will spin waiting for a hardware condition to trigger. A CPU core like the T4 will always have interrupt handling jitter due to its nature (interrupt priorities, predictive branching and memory caching...), that's one of the reasons that justifies the existence of so many hardware peripherals - having to interrupt the CPU to take care of something should be a "last resort" when it can't be handled any other way.
 
Haven't looked at it but I would guess it's to protect against someone using Serial.print in an ISR, which could trigger in the middle of updating some of the buffer tracking variables and end up corrupting them. Disabling the serial interrupt alone wouldn't be enough to protect against this since it could potentially be caused by any other interrupt.
Is there a way to make the read-modify-write atomic, so the disable_irq() would not be necessary? I know this would be conjecture on your part, but does it make sense to you that NXP would do the same thing in their LPUART driver, for the same reason, and not use the atomic operation if it was available?
 
T_4.0 Beta introduced a use of __LDREXW / __STREXW in uint32_t micros(void) to get sub-millis timing for usable micros.

It only checks whether an interrupt occurred across a READ of the 'shared' millis_count; it does not change any external data.

A few places across the cores code disable interrupts. Interrupts are not LOST, only left pending until re-enabled, at the cost of the 'delay' during that 'brief' time.
 
It's easy to do it for a single value (that's what ldrex/strex are for) but significantly harder to atomically update multiple values.
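One common way to read several related values consistently without masking interrupts is a sequence counter (often called a seqlock). This is a different technique from the ldrex/strex check in micros(), and it only covers the case of one writer (an ISR) and readers that can retry. A minimal sketch follows, with hypothetical names (tickIsr, readSnapshot, TimeSnapshot):

```cpp
#include <atomic>
#include <cstdint>

struct TimeSnapshot {
  uint32_t millis;
  uint32_t cycles;
};

// Hypothetical shared state, written only by one ISR.
static std::atomic<uint32_t> seq{0};  // even: stable, odd: update in progress
static std::atomic<uint32_t> sharedMillis{0};
static std::atomic<uint32_t> sharedCycles{0};

// Writer (stand-in for a tick ISR): bump seq to odd, update, bump to even.
void tickIsr(uint32_t cyclesNow) {
  uint32_t s = seq.load(std::memory_order_relaxed);
  seq.store(s + 1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_release);
  sharedMillis.store(sharedMillis.load(std::memory_order_relaxed) + 1,
                     std::memory_order_relaxed);
  sharedCycles.store(cyclesNow, std::memory_order_relaxed);
  seq.store(s + 2, std::memory_order_release);
}

// Reader (loop context): retry until both values come from the same update.
TimeSnapshot readSnapshot() {
  TimeSnapshot snap;
  uint32_t before, after;
  do {
    before = seq.load(std::memory_order_acquire);
    snap.millis = sharedMillis.load(std::memory_order_relaxed);
    snap.cycles = sharedCycles.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);
    after = seq.load(std::memory_order_relaxed);
  } while (before != after || (before & 1u));
  return snap;
}
```

The reader never blocks the ISR; it just repeats the read if an update landed in the middle. This does not help when loop-level code must update several values that an ISR reads, which is the harder direction.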
 
You can't use the C++ threading primitives on Teensy. They have to be implemented (e.g. like all the other stuff provided by newlib) by a kernel since you can't block on a mutex, semaphore, etc. without some sort of context/threading system including an idle thread that can always run when everything else is blocked. Newlib doesn't attempt any of that so they throw a compiler error.

The atomics are available and do use ldrex/strex, although they also include extra steps (emitting barriers) designed for multi-core CPUs that are unnecessary in this case.


Thank you
 
A few places across the cores code disable interrupts. Interrupts are not LOST, only left pending until re-enabled, at the cost of the 'delay' during that 'brief' time.

That time delay is the problem. It also accounts for the jitter in interrupt latency that we discussed in another thread. And that is a severe limitation. You simply can't do hard real-time like that. We really ought to hunt all of these down, or at least those that show up routinely, as in serial, and fix them, and add conspicuous warnings to the documentation of all the others.
 