Teensy 4.1: release thread from interrupt

On STM32, I used ChibiOS. It supports a lot of STM32 peripherals with kernel functions, and a lot of family members. But there is no port for Teensy.
On Teensy, I used FreeRTOS with no problems. I used the port from tsandmann.

One problem with the interrupt routines in the core libraries is that they lack any kind of weak external hook called at the end of the IRQ function. I only found one, in the CAN library.
Using such a hook, you call your own function, which puts the received byte or frame in a message queue that a thread is waiting on. The IRQ returns, and the kernel switches context to the waiting thread. This is more efficient than calling yield() in all threads and checking if bytes are available from devices.
 
It is really very simple.

It goes something like this, as I recall (it has been a while):

For scheduling it's context, stack, and task list. For interrupts it's lookup table, context and stack. For semaphores it's atomic memory access, task list or stack, context and stack.

I may have forgotten something, but it has all been around for a very long time. Making it more complicated doesn't often make it better. Realtime platforms are inherently simple, and need to be.
 
TeensyAtomThreads Library and Example Sketch

The ZIP file contains the latest files from A-Dunstan's TeensyAtomThreads repository, plus two files I created, library.properties and readme.txt. You should unzip to folder libraries/TeensyAtomThreads.

The INO file is a minimal example with a thread that blinks the built-in LED. Tested on T4.1 using Arduino 1.8.19 and TeensyDuino 1.59.

Here are some links. The first is A-Dunstan's port for Teensy. The second is the GitHub repository of AtomThreads author Kelvin Lawson. It has lots of test programs for the kernel, semaphores, queues, etc., though they are not designed for Arduino, so you'll have to work through them and cut and paste. The third, on the Indian Institute of Science (IISc) site, is a good introduction to AtomThreads. The fourth, atomthreads.com, has good docs.

https://github.com/A-Dunstan/TeensyAtomThreads
https://github.com/kelvinlawson/atomthreads
https://labs.dese.iisc.ac.in/embeddedlab/introduction-to-atomthreads/
https://atomthreads.com/
 

Attachments

  • AtomThreadsTest.ino (1.2 KB)
  • TeensyAtomThreads2025-11-13.zip (46.2 KB)
Wonderful! At last! It looks really hopeful.

So, my first naive thought is loop() becomes a thread. Yes?

Seems very simple, see main.cpp. Self explanatory.

What happens if we have a main.cpp in our sketch directory?

Does that supplant the one from cores/teensy4 ?
 
Did you look at the example I posted? It starts the OS from setup(), and as long as that doesn't fail, your code will never get to loop(). Everything would instead go into threads that you create. AtomThreads is not designed for Arduino, so the example programs don't have setup/loop. They just have a main() entry point, which Arduino hides.

Just add the TeensyAtomThreads library the same as you would any other. I highly recommend at least reading the introduction (3rd link in previous post). One caveat, too, I have never used AtomThreads, but I think it's a great alternative to FreeRTOS and others, which do require changes to the Teensy core files. I use a cooperative OS, which works well for what I do, and is more compatible with the Arduino way of doing things. Most people insist they need preemption.
 
No, I didn't look yet.

If you need to respond to a command interface, then you might want some sort of listener thread and it will typically be a loop even if it waits for a semaphore from the serial port. Launching the thread anew at each character probably makes less sense.
 
I don't understand your focus on semaphores. The serial port interrupt takes care of putting received bytes into its RX queue. I would simply create a thread for the command interface. Read what's available from the port when the thread runs, take action when a complete and valid command has been received, and go back to sleep otherwise.
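A minimal sketch of the "read what's available, act on a complete command" idea. The CommandAssembler class and its newline-terminated command format are assumptions made for illustration; nothing here comes from the Teensy core or from AtomThreads. It only handles the byte-by-byte assembly, so it can be tried on a desktop compiler before being dropped into a thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: accumulates received bytes until a newline
// completes a command. Not part of any Teensy or AtomThreads API.
class CommandAssembler {
public:
  explicit CommandAssembler(std::size_t max_len = 64) : max_len_(max_len) {
    assert(max_len > 0);
  }

  // Feed one received byte. Returns true when a complete command is ready,
  // in which case `out` holds it without the line terminator.
  bool feed(char c, std::string &out) {
    if (c == '\r') return false;         // ignore carriage returns
    if (c == '\n') {
      if (overflow_ || line_.empty()) {  // drop oversized or empty lines
        reset();
        return false;
      }
      out = line_;
      reset();
      return true;
    }
    if (line_.size() >= max_len_) {
      overflow_ = true;                  // keep discarding until the newline
      return false;
    }
    line_.push_back(c);
    return false;
  }

private:
  void reset() {
    line_.clear();
    overflow_ = false;
  }

  std::size_t max_len_;
  std::string line_;
  bool overflow_ = false;
};
```

In a command thread, the body would presumably drain the port with something like while (Serial.available()) { if (assembler.feed(Serial.read(), cmd)) handle(cmd); } and then sleep until its next run, where handle() is whatever your application does with a command.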
 

The problem is that available() uses __disable_irq(), which is a macro for "CPSID i" (see imxrt.h) which is an ARM instruction to disable all interrupts for the processor on which it is invoked.

Looping over available(), in an idiot loop no less, does that needlessly and more or less continuously. Very not good.

Posting a semaphore instead means that __disable_irq() is not happening when there is nothing in the buffer.

And really, __disable_irq() is not a good way to synchronize a buffer.

This is similar to one of the scenarios for which semaphores were invented.
 
The INO file is a minimal example with a thread that blinks the built-in LED. Tested on T4.1 using Arduino 1.8.19 and TeensyDuino 1.59.
With TeensyAtomThreads you don't need to do all the setup stuff, it's already taken care of by the time setup() is run. It uses some clever tricks to create the initial thread during startup using the default stack. So you can just use setup() and loop() like normal, all you have to add is the #include for the header file.

This is a really simple example that waits for pin 18 to go low:
Code:
#include <TeensyAtomThreads.h>

ATOM_SEM pin_sema;

static void p18_down(void) {
  // this function is called from IRQ context; must wrap any atom calls between atomIntEnter/atomIntExit
  atomIntEnter();
  atomSemPut(&pin_sema);
  atomIntExit(0);
}

void setup() {
  Serial.begin(0);
  while (!Serial);

  pinMode(18, INPUT_PULLUP);

  // create semaphore, initial count 0, limit maximum count to 1
  atomSemCreateLimit(&pin_sema, 0, 1);

  attachInterrupt(18, p18_down, FALLING);

}

void loop() {
  // block for up to 3 seconds waiting for pin
  uint8_t status = atomSemGet(&pin_sema, 3*SYSTEM_TICKS_PER_SEC);
  switch (status) {
    case ATOM_OK:
      Serial.println("Pin went low within 3 seconds");
      break;
    case ATOM_TIMEOUT:
      Serial.println("Timeout waiting for pin to go low");
      break;
    default:
      Serial.printf("Something odd happened: %d\n", status);
      break;
  }
}
If you actually run this it will likely trigger the "Pin went low..." message multiple times due to bouncing.

If you wanted to wait for multiple events at the same time you'd have to use a queue instead, to know which event actually occurred.
 
Oh, I had no idea. Does this mean that setup/loop are in that initial thread? At what priority?
 
The problem is that available() uses __disable_irq(), which is a macro for "CPSID i" (see imxrt.h) which is an ARM instruction to disable all interrupts for the processor on which it is invoked.

Looping over available(), in an idiot loop no less, does that needlessly and more or less continuously. Very not good.

Posting a semaphore instead means that __disable_irq() is not happening when there is nothing in the buffer.

And really, __disable_irq() is not a good way to synchronize a buffer.

This is similar to one of the scenarios for which semaphores were invented.

C++20 includes std::counting_semaphore if you feel like switching the compiler version. In C++17, an implementation might use a std::mutex with a condition variable. The excellent Abseil library includes one too, I think. It works on embedded systems and provides modern C++ things for systems that use older C++ versions.
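For reference, a minimal sketch of the C++17 approach mentioned above: a counting semaphore built from std::mutex and std::condition_variable. The class name CountingSemaphore is made up for illustration. It assumes a hosted environment with working standard threading support, which a bare-metal Teensy build may not provide, and because it locks a mutex it is not something to post from an ISR. On Teensy an RTOS primitive such as atomSemPut()/atomSemGet() would take its place.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Illustrative counting semaphore for C++17 (hosted environment assumed).
class CountingSemaphore {
public:
  explicit CountingSemaphore(unsigned initial = 0) : count_(initial) {}

  // Post: increment the count and wake one waiter.
  void release() {
    {
      std::lock_guard<std::mutex> lock(m_);
      ++count_;
    }
    cv_.notify_one();
  }

  // Block until the count is non-zero, then take one.
  void acquire() {
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return count_ > 0; });
    assert(count_ > 0);
    --count_;
  }

  // Wait with a timeout; returns false if nothing was posted in time.
  bool try_acquire_for(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(m_);
    if (!cv_.wait_for(lock, timeout, [this] { return count_ > 0; })) {
      return false;
    }
    --count_;
    return true;
  }

private:
  std::mutex m_;
  std::condition_variable cv_;
  unsigned count_;
};
```

The waiter sleeps inside wait() until release() is called, so nothing polls while the count is zero.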
 
Well, in that case, it would be good to clean up the __disable_irq() stuff in the serial port API.

Is there anything else that is looping over disabling interrupts?

That might be part of why interrupt latencies have so much jitter in general.

It would be good to find all of that kind of stuff and fix it.
 
C++20 includes std::counting_semaphore if you feel like switching the compiler version. In C++17, an implementation might use a std::mutex with a condition variable. The excellent Abseil library includes one too, I think. It works on embedded systems and provides modern C++ things for systems that use older C++ versions.

Really? Is that implemented in the Teensy?

Anyway, I'm usually most interested in the simplest lightest possible solution to any problem, even more so here.
 
It’s also possible to disable specific interrupts, eg. serial port interrupts, so you don’t need to disable all of them with disable-irq.
 

It is still not the right way to do it. Disabling interrupts is for something else.
 
Really? Is that implemented in the Teensy?

Anyway, I'm usually most interested in the simplest lightest possible solution to any problem, even more so here.

Yes. It’s about what the compiler vendor provides. In this case, an ARM-supplied version of GCC. I’m fairly certain that the implementation uses all the proper atomic instructions, etc.

Myself, I use some C++20 features on the Teensy with compiler version 14.2.1. (There’s no Intel Mac version of 14.3.1, sadly.) But I use PlatformIO and not the Arduino IDE, so it’s simple to change all these versions.

Apologies if you’ve mentioned this above, but I’m curious, do you use PlatformIO or the Arduino IDE?
 
My preferred development environment is gcc, make and emacs, and sometimes a debugger with ICE.

I've been using the Arduino IDE, but with some hesitancy. Too much of the low level code is not very good.

I do not see myself ever using something like PlatformIO. The design philosophy seems to be 180 degrees from the way I look at things, and it seems to be Windows-centric. No way.
 
PlatformIO isn’t an IDE. It’s a plug-in for whatever IDE you would like to use, including the command line. It’s Python-based and is a set of tooling that enables development for a wide range of platforms on a wide range of host platforms.

If I may, I’m guessing that you probably think it’s Windows-centric because many people use the plug-in with VSCode. I personally use VSCode with the plug-in on a Mac.

Your toolset certainly gives you at least as much power as I have. :) I was asking about all that because I was curious how easy it would be for you to change version numbers and toolchains.

As a sidenote, I would sooooooooooooooo love it if the off-the-shelf Teensys one day supported actual debugging.
 
Well, in that case, it would be good to clean up the __disable_irq() stuff in the serial port API.

Is there anything else that is looping over disabling interrupts?

That might be part of why interrupt latencies have so much jitter in general.

It would be good to find all of that kind of stuff and fix it.
It is unusual for a serial driver to have to disable interrupts, but for some reason it is necessary with iMXRT. Some time ago I checked the NXP SDK, and it does the same thing. Is disabling interrupts for a fraction of a microsecond a problem for your application? If so, then a serial command interface may not be a good choice.
 

Disabling for a microsecond, and in particular the wrong microsecond, is HUGE.

Doing it in an idiot loop means that absolutely, for sure, no doubt about it, it is going to land on the wrong microsecond.

Is it a problem for my application?

Yes! It is huge for lots of my applications. It would be huge for almost every embedded realtime application that I have ever built over the past 50 years.

And yes, idiot looping on disable global interrupts is a problem in general, and it shouldn't be done.

Sorry for the rant. This seems to push a button for me.
 
Here, I think this might offer a good solution, still within the Arduino framework and without resorting to a full-blown RTOS.

The following are about implementing a spin lock in ARM7 and the NXP in particular.



The proposal is to implement a simplified sort of mutex or semaphore for use between code called from loop() and ISRs. One such case is Serial.available() and the ISR that reads characters into the USB serial buffer. Rather than looping over __disable_irq() and checking the buffer, the ISR increments a counter, and Serial.available() checks the counter and only goes further, calling __disable_irq() and fetching characters, when there is something to fetch.

That alone might clean up a host of behaviors, all without having to resort to a full-blown RTOS.
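A rough sketch of that proposal, written so it runs on a host. Everything in it is hypothetical: GatedRxBuffer, enter_critical() and exit_critical() are stand-ins invented for illustration (on Teensy the last two would correspond to __disable_irq()/__enable_irq()), and this is not how the Teensy core currently implements Serial. It assumes a single producer (the ISR) and a single consumer (code called from loop()).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host stand-ins for __disable_irq()/__enable_irq(). The counter exists
// only to show how often the critical section is actually entered.
static unsigned g_critical_sections = 0;
static void enter_critical() { ++g_critical_sections; }
static void exit_critical() {}

class GatedRxBuffer {
public:
  // Called from the (simulated) receive ISR. Returns false when full.
  bool push_from_isr(uint8_t b) {
    if (pending_.load(std::memory_order_relaxed) == kSize) return false;
    buf_[head_] = b;
    head_ = (head_ + 1) % kSize;
    pending_.fetch_add(1, std::memory_order_release);
    return true;
  }

  // Cheap check: a single atomic load, interrupts stay enabled.
  std::size_t available() const {
    return pending_.load(std::memory_order_acquire);
  }

  // Enters the critical section only when there is something to fetch.
  // Returns the byte, or -1 when the buffer is empty.
  int read() {
    if (available() == 0) return -1;
    enter_critical();
    assert(tail_ < kSize);
    uint8_t b = buf_[tail_];
    tail_ = (tail_ + 1) % kSize;
    pending_.fetch_sub(1, std::memory_order_release);
    exit_critical();
    return b;
  }

private:
  static constexpr std::size_t kSize = 64;
  uint8_t buf_[kSize] = {};
  std::size_t head_ = 0;  // written only by the ISR
  std::size_t tail_ = 0;  // written only by the reader
  std::atomic<std::size_t> pending_{0};
};
```

Polling an empty buffer costs one atomic load and never touches the interrupt mask. With a strict single-producer, single-consumer arrangement the critical section in read() could arguably be dropped as well, but the sketch follows the proposal as stated.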

Should we start a new topic thread for this?
 
Disabling for a microsecond, and in particular the wrong microsecond, is HUGE.

Doing it in an idiot loop means that absolutely, for sure, no doubt about it, it is going to land on the wrong microsecond.

Is it a problem for my application?

Yes! It is huge for lots of my applications. It would be huge for almost every embedded realtime application that I have ever built over the past 50 years.

And yes, idiot looping on disable global interrupts is a problem in general, and it shouldn't be done.

Sorry for the rant. This seems to push a button for me.
It would be idiotic to loop over that. Your thread could call it once per execution, or at a frequency you define. It’s a low priority thread so it’s not a problem. Almost all embedded systems can tolerate that, including motor controls, etc. You seem to have a misunderstanding of how the threads would operate.
 
@joepasquariello

What's wrong with this analysis?

1) main() loops over loop() and yield(). (Aside, what other threads are there apart from ISRs?)

2) If loop() calls Serial.available(), finds nothing and returns, then you are looping over __disable_irq() at a pretty good clip.

3) If you insert a call to delay(), you get another idiot loop, this time over millis.

4) It seems better than just looping over __disable_irq(). But then, you still occasionally land on the wrong microsecond.

Therefore, adding delay() does not cure anything. It is still an intermittent error or "Fail". It is also harder to debug.

QED

I do know threads pretty well, and realtime too, maybe a bit over 1M lines of deployed code, most of it in high reliability applications.

But I do not yet know the architecture of the Arduino sketch well enough, and I have not yet found a document that describes it.

That they seem to eschew atomic memory operations and loop over turning off interrupts instead is a bit of a surprise if it is supposed to be a production environment. But maybe that is the disconnect or maybe it is a carry over from some earlier generation of processors. In either case, it needs to be cleaned up.
 
@joepasquariello

What's wrong with this analysis?

1) main() loops over loop() and yield(). (Aside, what other threads are there apart from ISRs?)
That's up to you and depends on the needs of your application. Typically I have one main thread for control and auxiliary threads for UART, I2C, or SPI, i.e. things that actually execute in parallel and for which a thread can actually wait for hardware events to occur.
2) If loop() calls Serial.available(), finds nothing and returns, then you are looping over __disable_irq() at a pretty good clip.
You can call Serial.available() at any frequency you like, fast or slow, without using delay(), even without an RTOS.
3) If you insert a call to delay(), you get another idiot loop, this time over millis.
See above.
4) It seems better than just looping over __disable_irq(). But then, you still occasionally land on the wrong microsecond.
If your application has a requirement that can't tolerate a 1-us delay in responding to an IRQ, then no, you can't do Serial.
Therefore, adding delay() does not cure anything. It is still an intermittent error or "Fail". It is also harder to debug.
You never, ever have to use delay().
I do know threads pretty well, and realtime too, maybe a bit over 1M lines of deployed code, most of it in high reliability applications.

But I do not yet know the architecture of the Arduino sketch well enough, and I have not yet found a document that describes it.
There is nothing special about Arduino, unless you mean the use of setup() and loop(). Arduino has a main(), just like any other C or C++ application, and it looks as shown below. That's all there is to it.
Code:
int main(void) {
  setup();
  while(1) {
    loop();
    yield();
  }
}
That they seem to eschew atomic memory operations and loop over turning off interrupts instead is a bit of a surprise if it is supposed to be a production environment. But maybe that is the disconnect or maybe it is a carry over from some earlier generation of processors. In either case, it needs to be cleaned up.
Turning off interrupts in T4 serial is not a programming choice, i.e. it could not be avoided via atomic memory operations.
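To illustrate the point about calling Serial.available() at a chosen rate without delay(), here is a small sketch of a non-blocking interval check. PeriodicPoll is a hypothetical helper written for this example, not part of the Teensy core; the time value would presumably come from millis().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reports when a fixed interval has elapsed,
// without ever blocking the caller.
class PeriodicPoll {
public:
  PeriodicPoll(uint32_t interval_ms, uint32_t now_ms)
      : interval_(interval_ms), last_(now_ms) {
    assert(interval_ms > 0);
  }

  // Returns true at most once per interval. Unsigned subtraction keeps the
  // comparison correct when the millisecond counter wraps around.
  bool due(uint32_t now_ms) {
    if (static_cast<uint32_t>(now_ms - last_) < interval_) return false;
    last_ = now_ms;  // restart the interval from the current time
    return true;
  }

private:
  uint32_t interval_;
  uint32_t last_;
};
```

In loop(), something like if (poll.due(millis())) { /* check Serial.available() here */ } would then touch the serial port only once per interval, and the rest of loop() keeps running in between.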
 