Hi,
This is about TeensyThreads, a library that is part of the Teensyduino distribution. Perhaps these findings are useful to others.
Documentation is here: https://github.com/ftrias/TeensyThreads
Discussion is here: https://forum.pjrc.com/threads/4150...library-first-release?highlight=TeensyThreads
It is a preemptive multitasker, and perhaps a hidden gem for some users.
You can define up to 16 threads, each typically an endless loop, including the normal Arduino loop().
A timer interrupt of the lowest priority regularly cuts into whatever is running, including library code (but not interrupt routines). It saves the registers of the current thread, restores the registers of the next thread and resumes that thread where it was interrupted. In this way it cycles through all threads, and each thread gets a certain time slice for computing. A thread can call threads.yield() to give up the rest of its time slice, so that the next thread starts immediately.
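To make the cycling concrete, here is a toy model that runs on a PC, not on the Teensy. It is only a sketch of the round-robin idea described above and says nothing about how TeensyThreads is implemented internally. The function name roundRobinSchedule and its parameters are made up for this illustration, and it assumes that no thread ever yields early.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model of round-robin scheduling with per-thread time slices.
// ticksPerThread[i] is the number of ticks thread i may run per turn
// (values below 1 are treated as 1).
// Returns, for each of the first totalTicks ticks, the index of the
// thread that would be running if no thread ever yields early.
std::vector<int> roundRobinSchedule(const std::vector<int>& ticksPerThread,
                                    int totalTicks) {
    std::vector<int> running;
    if (ticksPerThread.empty()) return running;
    std::size_t current = 0;
    while (static_cast<int>(running.size()) < totalTicks) {
        const int slice = std::max(1, ticksPerThread[current]);
        // The current thread keeps the CPU until its slice is used up.
        for (int t = 0;
             t < slice && static_cast<int>(running.size()) < totalTicks; ++t) {
            running.push_back(static_cast<int>(current));
        }
        // Then the next thread in the ring takes over.
        current = (current + 1) % ticksPerThread.size();
    }
    return running;
}
```

For two threads with 1 and 10 ticks, roundRobinSchedule({1, 10}, 22) gives thread 0 for one tick, thread 1 for ten ticks, then the same pattern again.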
From the documentation it was not really obvious to me how to set up different priorities and how to choose the right slice time, so I did some experiments with the code below.
Calling threads.setSliceMicros(10) sets the length of one tick to 10 microseconds.
Calling threads.setTimeSlice(id2, 10) then gives the second thread 10 ticks, which is 10 × 10 µs = 100 µs per cycle.
In my application I wanted some time-critical audio data handling outside the audio library to run with very high priority, and the graphical user interface (using ST7735_t3) to run with rather low priority. The audio library itself is driven by its own interrupt.
If you have 5 threads running, one of them the normal Arduino loop(), their ticks add up, together with the time for interrupt-driven code, to a cycle time. A given thread runs again only after this cycle time, so you want a short tick. On the other hand, switching threads takes time too, so very short ticks give low efficiency.

The test uses two counters; the counts reached after 1 second are compared for different tick lengths.
The picture shows the efficiency for a Teensy 4.1 running at 600 MHz. There are two counter threads running as loops without threads.yield(). The first has a time slice of 1 tick and the second a time slice of 10 ticks. If you choose a tick length of 10 microseconds, you still get an efficiency of 95% for the second, high-priority thread and 85% for the low-priority thread.
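One plausible way to arrive at such percentages is sketched here. It assumes that the count measured with the longest tick (1000 µs) serves as the 100% reference. That is an assumption read off the measurement table in the header comment of the test program, not necessarily how the picture was produced, and the function name efficiency is made up for this sketch.

```cpp
// Efficiency of a counter thread at a given tick length, taken as its
// count relative to a reference count measured with a very long tick
// (where thread switching overhead hardly matters).
// Both counts must cover the same measuring time, e.g. 1 second.
double efficiency(double countThousands, double referenceThousands) {
    if (referenceThousands <= 0.0) return 0.0;  // avoid division by zero
    return countThousands / referenceThousands;
}
```

With the table values for 10 µs (7687k and 86931k) and for 1000 µs (9084k and 90955k), this gives roughly 0.85 for the 1-tick thread and roughly 0.96 for the 10-tick thread, close to the figures read from the picture.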

The table (in the header comment of the test program) shows the counts in thousands (with 10 times the run time you should get 10 times the count) and the maximum cycle time. So for a tick length of 10 microseconds, with one thread of 1 tick and one thread of 10 ticks, you get a cycle time of 110 µs. In this case the time for interrupt-driven code is negligible.
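The cycle time arithmetic can be written down as a small helper. This is only a sketch for use on a PC: the name cycleTimeMicros is made up, and it assumes every thread uses its full slice and that interrupt time and the switching time itself can be ignored.

```cpp
#include <cstdint>
#include <vector>

// Cycle time in microseconds if every thread uses its full slice:
// tick length multiplied by the sum of all threads' ticks.
// Time spent in interrupts and in the thread switch itself is ignored.
std::uint32_t cycleTimeMicros(std::uint32_t tickMicros,
                              const std::vector<std::uint32_t>& ticksPerThread) {
    std::uint32_t totalTicks = 0;
    for (std::uint32_t ticks : ticksPerThread) totalTicks += ticks;
    return tickMicros * totalTicks;
}
```

cycleTimeMicros(10, {1, 10}) gives 110, which matches the measured value. For the very short ticks of 1 µs and 3 µs the measured cycle times (14 and 42) are longer than this simple sum (11 and 33), presumably because the switching time is no longer negligible there.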
Code:
threads.setSliceMicros(10); // Setting is needed!
Code:
int id1= threads.addThread(thread_func, 1); // Start first function with parameter 1
int id2= threads.addThread(thread_func2, 1); // Start second function with parameter 1
threads.setTimeSlice(id2, 10); // set priority. This Thread gets 10 times more time
Code:
/* Find out about timing and priorities of TeensyThreads
 * CWE 05.02.2022
 * Teensy 4.1 @600MHz
 * Counts per second in thousands (k), cycle time Max/Min in µs:
 * Slice:     1µs  1:   623k  2: 47413k  Max:    14  Min:    12
 * Slice:     3µs  1:  4436k  2: 77669k  Max:    42  Min:    33
 * Slice:    10µs  1:  7687k  2: 86931k  Max:   110  Min:   110
 * Slice:    30µs  1:  8620k  2: 89595k  Max:   330  Min:   330
 * Slice:   100µs  1:  8955k  2: 90598k  Max:  1100  Min:  1100
 * Slice:   300µs  1:  9070k  2: 91062k  Max:  3300  Min:  3300
 * Slice:  1000µs  1:  9084k  2: 90955k  Max: 11000  Min: 11000
 * Slice: not set  1: 52770k  2: 47977k  Max: 21000  Min: 21000  => no priority
 * https://github.com/ftrias/TeensyThreads
 */
#include <TeensyThreads.h>

// Counters incremented by the two worker threads.
// Unsigned, so that wrap-around after long runs is well defined.
volatile uint32_t count = 0, count2 = 0;
uint32_t oldCount = 0, oldCount2 = 0;  // previous readings, used only in loop()
volatile uint32_t lastMicros, diffMicros, maxMicros = 0, minMicros = 100000;

// Low priority counter thread (1 tick)
void thread_func(int inc) {
  while (1) count += inc;
}

// High priority counter thread (10 ticks)
void thread_func2(int inc) {
  while (1) count2 += inc;
}

// Measure the cycle time over all threads in µs.
// The thread yields at once, so it runs once per cycle.
void measCycle() {
  lastMicros = micros();
  while (1) {
    uint32_t nowMicros = micros();
    diffMicros = nowMicros - lastMicros;
    maxMicros = max(maxMicros, diffMicros);
    minMicros = min(minMicros, diffMicros);
    lastMicros = nowMicros;
    threads.yield();
  }
}

int slice = 10;  // Tick length in µs. Variation here

void setup() {
  threads.setSliceMicros(slice);  // Setting is needed! See the "not set" row above
  //threads.setSliceMillis(500);
  threads.addThread(thread_func, 1);
  int id2 = threads.addThread(thread_func2, 1);
  threads.setTimeSlice(id2, 10);  // Set priority: this thread gets 10 times more time
  threads.addThread(measCycle);
}

void loop() {
  // Take one reading of each counter, then print the increase since the last pass
  uint32_t getCount = count;
  uint32_t getCount2 = count2;
  Serial.printf("Slice: %dµs 1: %luk 2: %luk Max: %lu Min: %lu\n", slice,
                (unsigned long)((getCount - oldCount) / 1000),
                (unsigned long)((getCount2 - oldCount2) / 1000),
                (unsigned long)maxMicros, (unsigned long)minMicros);
  oldCount = getCount;
  oldCount2 = getCount2;
  maxMicros = 0;
  minMicros = 100000;
  threads.delay(1000);
  //delay2(1000);
  //delay(1000);
}

// Alternative busy-wait delay, kept for experiments (see commented call above)
void delay2(uint32_t ms)
{
  uint32_t mx = millis();
  while (millis() - mx < ms);
}
So all in all, for a Teensy 4.1 at 600 MHz a tick length of around 10 microseconds seems to be a sweet spot between still rather good efficiency and a low cycle time.
Have fun, Christof