Teensy 4.1: release thread from interrupt

tjaekel

Well-known member
I think the lack of a real RTOS is hurting me. I cannot implement what I want:
  • when a GPIO interrupt is triggered, a thread should be released (running outside INT context)
  • the interrupt handler should unlock a semaphore ("give from ISR"), so that the thread waiting on the semaphore is the next code to run (immediately)

It looks like this is not possible or not supported (because a real RTOS is missing).
I use TeensyThreads, which is fine, but there is no way to use semaphores and, in particular, no means to trigger a thread from inside an ISR handler.
(TeensyThreads does not have semaphores, using a Mutex in an ISR handler crashes, and there is no real scheduler which would reschedule right after the ISR returns.)

How can I do this?
I do not want to set a shared global variable which I then have to poll (that would still be a polling approach!).
The reaction would not be immediate (real-time) when the INT fires; instead I would only "notice" that the variable has changed after
some delay, e.g. when I come back to loop().
I do not have anything in loop(); everything runs as threads. loop() is like an "idle thread" (and delays several milliseconds there).

I need to trigger a thread which handles my interrupt, but outside of the MCU INT context (inside an ISR other INTs are blocked, so Serial.print()
and other functions using HW INTs are not possible!).
After a GPIO INT has been triggered I want to fire SPI transactions, which need interrupts as well. That is not possible right now.

With polling, it is unpredictable when the global variable will be seen: the interrupt latency would be too large and not real-time.
I want my ISR handler to run as a thread that is scheduled immediately after the HW ISR function has released a semaphore and returned from the INT
(as CMSIS FreeRTOS does).

The lack of a real RTOS, e.g. CMSIS FreeRTOS, a real scheduler which works together with interrupts, is "painful".

Thinking about how to make it available:
a) try to port and integrate CMSIS FreeRTOS (or any other RTOS with support for INT handling and scheduling)
b) implement an assembly-based "context switch", e.g. modify the LR register and other MCU registers (which determine where execution continues after the ISR returns) and jump to my ISR thread handler
(running outside the INT context, so it is able to use INTs as well)
c) the tricky part: I want to jump to the ISR thread handler immediately after the INT handler "returns from interrupt", but when that handler finishes, execution should continue at the code line
where the INT kicked in. So I have to save and restore the context properly: one thread is interrupted, another thread runs after the ISR, and then it must go back to the originally interrupted thread.
d) usually, an RTOS uses the PendSV interrupt to do this job: the HW ISR sets PendSV pending, so that the scheduler is invoked right after the return from the HW ISR.
I have no idea whether the PendSV vector can be rerouted to FreeRTOS (or to my own scheduler) inside the Arduino/Teensy LIBs: the INT vector table must be altered.

Does anybody have a solution (or a better idea) for how to get a thread triggered from an ISR (and executed immediately after the ISR has finished)?
Such an ISR would just give a semaphore to release the ISR handler thread, which runs outside INT context like any other thread and can use any other function.
Please no polling of a shared global variable (unpredictable when it will be seen)!
 
If it's critical for your task to run as soon as the ISR exits, you could use FreeRTOS. I don't like that solution because it's not fully supported and requires some (small?) changes to the Teensy core. I prefer to use a cooperative RTOS and then manage what is done within an ISR and what is done at task level. For my projects, this method works well. If something is truly time-critical, it will be done in the ISR, and the result is posted to a mailbox or queue. The FreqMeasureMulti library is a good example of how you can do this with no RTOS, but as you say, you must "poll" the queue from task level. With a cooperative OS, the delay between the ISR and the task-level processing will vary, but that's not an issue in my projects. Things occur fast enough. SPI is an interesting question, such as if you wanted to use an SPI DAC and sample at precise intervals. I haven't done this, but instead of using the SPI library, I would configure/start the SPI transfer in the ISR, then exit the ISR and wait for the SPI complete at task level. The Teensy SPI library has some support for interrupts, but I have never used that feature, and I'm not sure whether it would help in your case.

Can you tell us more about what your system is doing, and the actual timing requirements? Someone might be able to suggest other solutions.
 
Thank you.
Any polling, even of a shared variable set by the ISR, is not "fast enough".
Meanwhile, I have figured out that FreeRTOS works (if you download and install the ZIP): no modification needed, and it works as I need (details in a separate post below).

My system (and why I need "real-time" handling of the ISR):
The MCU is connected to an external chip (via SPI). This is my "DUT" (Device Under Test).
This chip generates an interrupt (e.g. it is running a scan; assume a health sensor). When this INT happens, I have to drain the scan results immediately!
It keeps scanning even if I do not drain the result FIFO (via SPI). If I am too late, I get a FIFO overflow, a timeout etc.
So, when the HW INT (GPIO) comes, I have to start an SPI transaction immediately in order to drain the results.
And if the scan rate is 1 kHz (and it is), any 1 ms delay somewhere, e.g. in loop(), just to "poll" the shared variable which signals that the INT was triggered, is too late.
So I really need a HW INT on the GPIO, with an ISR that immediately releases my ISR Handler Thread for scheduling (where I can call other functions, like SPI, which also need HW INTs).
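
To put numbers on that budget, here is a minimal sketch of the arithmetic. Only the 1 kHz scan rate comes from the post; the wake-up latency and SPI drain time passed in are hypothetical placeholders, and periodUs/slackUs are illustration-only names, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// Time between two scan interrupts, in microseconds.
inline std::int64_t periodUs(std::uint32_t rateHz) {
    assert(rateHz > 0);
    return 1000000 / static_cast<std::int64_t>(rateHz);
}

// Time left in one scan period after waking the handler and draining the FIFO.
// A negative result means the next interrupt arrives before the drain is done.
// wakeLatencyUs and drainUs are assumed inputs, not measured values.
inline std::int64_t slackUs(std::uint32_t rateHz,
                            std::int64_t wakeLatencyUs,
                            std::int64_t drainUs) {
    return periodUs(rateHz) - wakeLatencyUs - drainUs;
}
```

With a 1 ms polling delay and an assumed 200 µs drain, slackUs(1000, 1000, 200) is -200; with an assumed 10 µs wake-up, slackUs(1000, 10, 200) is 790. Whether a negative slack actually overflows the FIFO depends on its depth, which the post does not state.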

Never mind: I have found that FreeRTOS for Teensy 4.x works as I need (and as I use it on other MCUs, like the STM32).
And it works "out of the box": just install the ZIP as a LIB, no tweaks (for now) to other existing LIBs.
 
Here is the solution: FreeRTOS for Teensy 4.x
I found and tried this:
https://github.com/tsandmann/freertos-teensy

Just "Sketch -> Include Library -> Add .ZIP Library" and it is there.
(for now, nothing to modify on other LIBs, IDE etc.)

Here is the sketch using a GPIO INT with an ISR; the ISR releases my ISR Handler Thread via a notification "from ISR".
This ISR Handler Thread is a regular thread: you can do whatever you want there, e.g. Serial.print(), SPI transactions etc. (using other HW INTs).
Works for me (in this simple example).

It works this way:
It creates three threads (tasks). One is my ISR Handler Thread (running as a regular thread).
The GPIO ISR sends a notification to the blocked, waiting ISR Handler Thread.
This is done via "vTaskNotifyGiveIndexedFromISR": the notification is sent from the ISR, so that when the ISR returns, the scheduler sees that a higher-prio thread was waiting for it.
The ISR Handler Thread is released. So it runs immediately after the ISR has finished, not after a polling period has elapsed (and there is nothing in loop()).
Nothing to poll! And no delays: all the work for the INT is handled almost immediately (but outside the ISR).

Code:
/*
 * This file is part of the FreeRTOS port to Teensy boards.
 * Copyright (c) 2020 Timo Sandmann
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library. If not, see <http://www.gnu.org/licenses/>.
 *
 * ----------------------------------------------------------------------------
 * Extend this example to use a GPIO ISR (real HW INT):
 * - trigger an ISR Handler Thread, running outside of MCU INT context
 * - do regular stuff, e.g. Serial.print() in this ISR Handler Thread
 * - send a notification from HW ISR to the waiting ISR Handler Thread
 */

/**
 * @file    main.cpp
 * @brief   FreeRTOS example for Teensy boards with GPIO interrupt and ISR
 * @author  Timo Sandmann, modified: Torsten Jaekel
 * @date    06/06/2023
 */

#include "arduino_freertos.h"
#include "avr/pgmspace.h"

//#define ULONG_MAX 0xFFFFFFFF
#include <climits>

static TaskHandle_t xTaskToNotify = NULL;

static void task1(void*) {
    while (true) {
        ::digitalWriteFast(arduino::LED_BUILTIN, arduino::LOW);
        ::vTaskDelay(pdMS_TO_TICKS(500));

        ::digitalWriteFast(arduino::LED_BUILTIN, arduino::HIGH);
        ::vTaskDelay(pdMS_TO_TICKS(500));
    }
}

static void task2(void*) {
    while (true) {
        ::Serial.println("TICK");
        ::vTaskDelay(pdMS_TO_TICKS(1'000));

        ::Serial.println("TOCK");
        ::vTaskDelay(pdMS_TO_TICKS(1'000));
    }
}

static void taskISR(void *pvParameters) {
  /*
   * this is our ISR Handler Thread, running in regular context (not in INT context),
   * so we can do all we want to do, e.g. Serial.print() or using other functions needing INTs
   */

  uint32_t ulNotifiedValue;
  //we have to tell the ISR who the receiver of the notification is, here: our own thread
  xTaskToNotify = xTaskGetCurrentTaskHandle();

  while (true) {
    ::xTaskNotifyWaitIndexed( 0,                /* Wait for 0th notification. */
                              0x00,             /* Don't clear any notification bits on entry. */
                              ULONG_MAX,        /* Reset the notification value to 0 on exit. */
                              &ulNotifiedValue, /* Notified value pass out in ulNotifiedValue. */
                              portMAX_DELAY );  /* Block indefinitely. */

    ::Serial.println("INT Handler triggered");
  }
}

void GPIO_Interrupt() {
    /*
     * this is our HW ISR: it sends a notification via the scheduler so that
     * our ISR Handler Thread is activated right after this ISR has returned
     */

    BaseType_t xHigherPriorityTaskWoken = pdFALSE;

    /* xTaskToNotify is set by taskISR when it first runs. If the interrupt
    fires before that, this assert trips. */
    configASSERT( xTaskToNotify != NULL );

    /* Notify the ISR Handler Thread that the GPIO interrupt has fired. */
    vTaskNotifyGiveIndexedFromISR( xTaskToNotify, 0, &xHigherPriorityTaskWoken );

    /* If xHigherPriorityTaskWoken is now set to pdTRUE then a context switch
    should be performed to ensure the interrupt returns directly to the highest
    priority task.  The macro used for this purpose is dependent on the port in
    use and may be called portEND_SWITCHING_ISR(). */
    portYIELD_FROM_ISR( xHigherPriorityTaskWoken );
}

FLASHMEM __attribute__((noinline)) void setup() {
    ::Serial.begin(115'200);
    ::pinMode(arduino::LED_BUILTIN, arduino::OUTPUT);
    ::digitalWriteFast(arduino::LED_BUILTIN, arduino::HIGH);

    ::delay(5'000);

    if (CrashReport) {
        ::Serial.print(CrashReport);
        ::Serial.println();
        ::Serial.flush();
    }

    //configure GPIO pin for HW INT
    pinMode(23, arduino::INPUT_PULLUP);                //enable pull-up
    attachInterrupt(digitalPinToInterrupt(23), GPIO_Interrupt, arduino::FALLING);

    ::Serial.println(PSTR("\r\nBooting FreeRTOS kernel " tskKERNEL_VERSION_NUMBER ". Built by gcc " __VERSION__ " (newlib " _NEWLIB_VERSION ") on " __DATE__ ". ***\r\n"));

    ::xTaskCreate(task1, "task1", 128, nullptr, 2, nullptr);
    ::xTaskCreate(task2, "task2", 128, nullptr, 2, nullptr);
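    //create our ISR Handler Thread with a higher priority than task1/task2 (3 > 2),
    //so that it preempts them as soon as the ISR returns
    ::xTaskCreate(taskISR, "taskISR", 128, nullptr, 3, nullptr);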

    ::Serial.println("setup(): starting scheduler...");
    ::Serial.flush();

    ::vTaskStartScheduler();
}

void loop() {}

So, I will use this FreeRTOS LIB on the Teensy 4.1 and move away from TeensyThreads to this real FreeRTOS approach
(much better: it has a scheduler, can schedule a thread from an ISR ...)
 
Good to know that FreeRTOS will work for you. You can always avoid delay() by calling millis() and checking for the desired end time. "Polling delay" for a signal from ISR can easily be kept in the low microseconds, and cooperative multi-tasking can simplify things quite a bit.
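
A minimal sketch of the millis()-based wait mentioned above, assuming a 32-bit millisecond counter like the one the Arduino millis() call returns. hasElapsed and PeriodicTimer are hypothetical helper names; the caller passes in the current time, so on the Teensy that argument would be millis().

```cpp
#include <cassert>
#include <cstdint>

// True once at least intervalMs have passed since startMs.
// The unsigned subtraction stays correct when the counter wraps around.
inline bool hasElapsed(std::uint32_t nowMs, std::uint32_t startMs,
                       std::uint32_t intervalMs) {
    return static_cast<std::uint32_t>(nowMs - startMs) >= intervalMs;
}

// Non-blocking periodic timer: poll() never waits, it only reports
// whether the interval is over and then re-arms itself.
class PeriodicTimer {
public:
    PeriodicTimer(std::uint32_t startMs, std::uint32_t intervalMs)
        : startMs_(startMs), intervalMs_(intervalMs) {
        assert(intervalMs > 0);
    }

    bool poll(std::uint32_t nowMs) {
        if (!hasElapsed(nowMs, startMs_, intervalMs_)) {
            return false;
        }
        startMs_ += intervalMs_;  // re-arm relative to the previous deadline
        return true;
    }

private:
    std::uint32_t startMs_;
    std::uint32_t intervalMs_;
};
```

In a loop() or a cooperative task this would be called as timer.poll(millis()), and the code falls through to the other checks whenever it returns false.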
 
What do you mean by "cooperative multi-tasking"?
I do not want to argue, but any SW design which needs delay() or even millis() is not real-time (for me, a "bad design").
A real-time system should work based on events, notifications and semaphores, not on polling and not on "delay() needed".

Assume this code:
Code:
void loop() {
   if (mySharedVarSet) {
       //do something, e.g. INT was triggered
   }
}

Do you assume that my network handlers, my ETH server, would be running? No.

So, I could do this:
Code:
void loop() {
  if (mySharedVarSet) {
       //do something, e.g. INT was triggered
  }
  if (ETHhasReceived) {
      //process my network traffic, do heavy stuff
  }
}

What happens when parsing my received ETH traffic and acting on it takes a minute?
My GPIO INT would be noticed only after a minute. How bad is this?
Not real-time!
Instead, I want my GPIO INT to be handled even if I am right in the middle of my network stuff.
I cannot wait until all my network stuff is idle (it might never be idle, because I want to send my SPI transaction results via the network).

So, "cooperative" means for me: not real-time, because the longest task/thread determines my response time.
I have a more important (higher-prio) task to do, e.g. reacting "immediately" to a GPIO interrupt during a network session (because the external chip
needs to be handled without delay).
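
A simplified model of that worst case, as a sketch: in a plain loop() which checks the flag once per pass, an interrupt arriving just after the check has to wait for every other step in the pass. The step durations are made-up inputs and worstCaseLoopLatencyUs is a hypothetical helper. A cooperative RTOS with priorities could do better than this round-robin model, because there the wait is bounded by the longest run between two yields.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Worst-case delay before loop() looks at the interrupt flag again:
// the flag is set right after its check, so all other steps run first.
// otherStepsUs holds assumed durations of those steps in microseconds.
inline std::uint64_t worstCaseLoopLatencyUs(
        const std::vector<std::uint32_t>& otherStepsUs) {
    return std::accumulate(otherStepsUs.begin(), otherStepsUs.end(),
                           std::uint64_t{0});
}

// True if the flag is guaranteed to be seen within deadlineUs.
inline bool meetsDeadline(const std::vector<std::uint32_t>& otherStepsUs,
                          std::uint64_t deadlineUs) {
    return worstCaseLoopLatencyUs(otherStepsUs) <= deadlineUs;
}
```

With one quick step of 50 µs and one network step of a full minute (60000000 µs), the model gives 60000050 µs, far beyond a 1000 µs deadline.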

And what happens in a "cooperative multi-tasking" system when one of the tasks/threads uses delay() (or millis())?
It blocks all the others for this duration. Any delay() is like "pausing the MCU". Using delay() is still "polling": it does not matter whether INT handlers
are used as well if the "polling" of a shared variable does not notice the change "immediately" (but only when everything goes idle and the delay() has expired).

A real HW interrupt, handled in real-time, should not be considered "cooperative". Instead: threads/tasks with priorities, scheduled "immediately" (for real-time action).
A "cooperative" design for a real-time system is nearly impossible. (During a long-running task I would have to check periodically whether another task should be "allowed" to run, which makes this
task run even longer, e.g. a check every few instructions to "give the other one a chance" when there is nothing to do, just wasted time.)

Using delay() or millis() burns power and wastes MCU processing capability. Real time needs an event-driven system (not polling), with real INT support for events and scheduling.
 
What do you mean by "cooperative multi-tasking"?

Cooperative means non-preemptive, with each task voluntarily giving up the processor.

I do not want to argue, but any SW design which needs delay() or even millis() is not real-time (for me, a "bad design"). A real-time system should work based on events, notifications and semaphores, not on polling and not on "delay() needed".

Cooperative and preemptive OS can have all of the same types of features for handling events, mutual exclusion, and inter-task communication. The only difference is that in the cooperative system, task swaps occur only on OS calls, e.g. waiting for an event, message, OS-based delay, etc.
 
On the Teensy, millis() calls are pretty cheap because they just read a 32-bit variable that’s updated by the “systick” interrupt. That subsystem is running all the time anyway.
 
Yes, and the cooperative system can have a system timer tick, just like FreeRTOS, so “delay” just means “let other tasks run for the next X ms”.
 
A "cooperative system" is not helpful for me: I need real-time and preemption:
a GPIO INT thread should be scheduled immediately.

FreeRTOS works fine:
Command line, HTTP server and TFTP are now FreeRTOS threads, plus the GPIO INT handler is a higher-prio thread, released from the interrupt.

One issue: AsyncWebServer does not work with FreeRTOS (but QNEthernet as a TCP listener does).
My impression: it needs so much local variable space that the FreeRTOS thread stack overflows, or it hangs (I assume a stack corruption).
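
One thing worth checking for the stack suspicion: the stack depth argument of xTaskCreate is counted in stack words, not bytes. A minimal sketch of the conversion, assuming a 4-byte stack word as on 32-bit Cortex-M ports; StackWord, stackBytes and stackWordsFor are illustration-only names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: one stack word is 32 bits wide, as on 32-bit Cortex-M ports.
using StackWord = std::uint32_t;

// Bytes actually reserved for a task created with this stack depth.
constexpr std::size_t stackBytes(std::size_t depthWords) {
    return depthWords * sizeof(StackWord);
}

// Stack depth (in words, rounded up) needed to reserve at least `bytes`.
constexpr std::size_t stackWordsFor(std::size_t bytes) {
    return (bytes + sizeof(StackWord) - 1) / sizeof(StackWord);
}

// The depth of 128 used in the example sketch reserves only 512 bytes.
static_assert(stackBytes(128) == 512, "128 words of 4 bytes are 512 bytes");
```

Under that assumption, a task whose library code keeps a few kilobytes of locals on the stack would need a depth in the thousands rather than 128. FreeRTOS also offers uxTaskGetStackHighWaterMark() and the configCHECK_FOR_STACK_OVERFLOW hook to confirm an overflow, provided they are enabled in the configuration; whether that is the actual cause of the AsyncWebServer problem is not established in this thread.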

Project works now as it should, as real-time system, with preemption and thread priorities.
No polling of shared variables, real threads.

My GIT project is here:
https://github.com/tjaekel/Teesny_4_1

It has FreeRTOS threads for:
Command Line (UART), plus Pico-C interpreter running in it
HTTP server (port 80), for WebBrowser and Python scripts, including BINARY mode
TFTP, running in background, just initialize the SD card via "sdinit 1" before a transfer of a file
GPIO interrupt: ISR releases a handler thread from interrupt (not polling, not cooperative, as "real-time", scheduled immediately)
 
Neat! When using QNEthernet, just make sure that all calls to the library are done from the same thread. I made its lwIP usage single-threaded and there’s a bunch of internal assumptions that rely on there being no concurrent access to the library. You could change that, of course, by modifying how lwIP is used, but at that point, using direct lwIP access may be better.
 
Actually, I do not do that, I do not take care of it, and it still works (which is actually obvious to me):
I have at least TWO threads using QNEthernet: one for the HTTP server (port 80) and another one for TFTP.
And the port 80 server accepts several clients connected at the same time (e.g. a WebBrowser and a Python script, but still as a single thread).
BTW: I will add a third one for a UDP sender: on every GPIO INT a UDP packet should be sent. YES, this could become an issue with QNEthernet.

It works, even with several threads.
But only because of "constraints" behind the scenes:
- the threads are not triggered by a real INT: they just call vTaskDelay(1); and run in "polling mode", waiting for something to do
- all threads using QNEthernet have the same priority, so they are not really "preemptive" (a thread keeps running, due to its prio, to the end of its thread loop)
- so, technically, it is a "single-threaded system": even when running in different FreeRTOS threads, as long as no thread is scheduled
in a way that interrupts another one ("preemptively", which does not happen for now), all should be fine

I agree with you: the problem can occur when one running thread is "interrupted/intercepted" by another thread and
QNEthernet is then called in a "recursive mode" (a function is called again while I am still inside this function).
I have to see how "thread-safe" the QNEthernet implementation is (as long as it does not use global variables/shared resources, it should be fine).
With "polling", using vTaskDelay(1); and the same thread prios, it should not happen (and does not seem to happen: all fine).

Direct lwIP: how would I do that?
I am struggling with using native libraries. In the end it will not help, because the "native" lwIP would need code changes, e.g. to use threads,
semaphores, mutexes etc., to make it "multi-thread-safe". And QNEthernet seems to use the lwIP LIB, but without any FreeRTOS support:
it tells me that lwIP is not configured for an RTOS anyway. And modifying the lwIP LIB breaks QNEthernet.

Never mind: it works fine for me, even with multiple threads.
Potentially, I can hit an issue when sending UDP, triggered by a real GPIO INT with a higher prio, while a parallel access (from the WebBrowser, a Python script or TFTP) is going on.
We will see: I think I can still handle it by putting "Critical Section" semaphores around it (which I also need for Serial.print()).

My "assumption":
as long as I can make sure that QNEthernet functions are called "exclusively", no matter in/from which thread, and never "preemptively" (resulting in "recursive" calls), it should be fine.
And I could still solve it by putting a "Critical Section" around the entire QNEthernet stuff.
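
A minimal sketch of such a "Critical Section" wrapper, written against a generic lock so that it can be tried on a PC. ScopedLock, withNetworkLock and CountingLock are hypothetical names; on the Teensy the lock type would presumably wrap a FreeRTOS mutex (xSemaphoreCreateMutex(), xSemaphoreTake(), xSemaphoreGive()), which must not be taken from an ISR.

```cpp
#include <cassert>

// RAII guard: takes the lock on construction, releases it on scope exit.
template <typename Lockable>
class ScopedLock {
public:
    explicit ScopedLock(Lockable& lock) : lock_(lock) { lock_.lock(); }
    ~ScopedLock() { lock_.unlock(); }
    ScopedLock(const ScopedLock&) = delete;
    ScopedLock& operator=(const ScopedLock&) = delete;

private:
    Lockable& lock_;
};

// Runs fn() while holding the lock, so calls wrapped this way never overlap.
template <typename Lockable, typename Fn>
auto withNetworkLock(Lockable& lock, Fn&& fn) -> decltype(fn()) {
    ScopedLock<Lockable> guard(lock);
    return fn();
}

// Host-side stand-in for a real mutex (assumption, for illustration only):
// it just checks that lock/unlock are paired and never nested.
struct CountingLock {
    int depth = 0;
    int acquisitions = 0;
    void lock() { assert(depth == 0); ++depth; ++acquisitions; }
    void unlock() { assert(depth == 1); --depth; }
};
```

This only helps if every QNEthernet call, in every task, goes through the same lock. Note also that stock FreeRTOS enables time slicing by default (configUSE_TIME_SLICING), so equal priorities alone may not keep tasks from being switched on a tick; whether the Teensy port changes that, and whether a lock like this is sufficient for QNEthernet, is not confirmed in this thread.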

All fine for now. Thank you.
 