TeensyTimerTool

luni

TeensyTimerTool - Easy to use interface to the Teensy Timers

I just published TeensyTimerTool on GitHub: https://github.com/luni64/TeensyTimerTool. The library provides an easy-to-use interface to the Teensy timers. Currently it works on the T4.0 and uses the GPT (2x) and QUAD (16x) timer modules/channels. Additionally, it provides up to 20 software-based timers with exactly the same interface. Extension to the PIT modules and the other boards is planned. All timers can be used in periodic and one-shot mode. You can either specify which hardware or software timer you want to use or, if you don't care, get the next free timer from a pool.

Here is a quick example of how to use a timer from the pool and set it up in one-shot mode:

C++:
#include "TeensyTimerTool.h"
using namespace TeensyTimerTool;

Timer t1;

void setup()
{
    pinMode(LED_BUILTIN,OUTPUT); 
    t1.beginOneShot(callback);
}

void loop()
{ 
    digitalWriteFast(LED_BUILTIN, HIGH);
    t1.trigger(10'000);   // trigger the timer with 10ms delay
    delay(500);
}

void callback()   // switch off LED
{
    digitalWriteFast(LED_BUILTIN, LOW);
}

If you want to use GPT1 instead, all you need to do is to replace the line
Code:
Timer t1;
by
Code:
Timer t1(GPT1);

The GitHub README contains more examples demonstrating the usage.

Callbacks
In addition to the usual pointers to void functions, all timers accept non-static member functions, lambdas, functors etc. as callbacks. This makes it very easy to pass state to a callback or to write classes which embed timers and callbacks. Examples here. If for some reason you prefer a plain vanilla function pointer interface, you can configure the library accordingly.
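
To make the list of accepted callables concrete, here is a small host-side sketch in plain C++. It does not use TeensyTimerTool at all: `callback_t`, `Blinker`, `Counter` and `fireAllKinds` are made-up names, and `std::function` is only an assumed stand-in for whatever the library uses internally.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a timer's callback slot; not TeensyTimerTool code.
using callback_t = std::function<void()>;

struct Blinker                     // class with a non-static member callback
{
    int toggles = 0;
    void onTimer() { ++toggles; }
};

struct Counter                     // functor: its state travels with the object
{
    int* target;
    void operator()() const { ++*target; }
};

int freeCalls = 0;
void freeFunction() { ++freeCalls; }

// Fires each kind of callable once and returns the total number of effects seen.
int fireAllKinds()
{
    Blinker blinker;
    int functorHits = 0;
    int lambdaHits = 0;
    freeCalls = 0;

    callback_t slots[] = {
        freeFunction,                       // plain function pointer
        [&lambdaHits] { ++lambdaHits; },    // capturing lambda
        Counter{&functorHits},              // functor
        [&blinker] { blinker.onTimer(); },  // member function, bound via a lambda
    };

    for (auto& cb : slots)
    {
        assert(cb);                         // slot must not be empty
        cb();                               // a timer would do this on expiry
    }
    return freeCalls + lambdaHits + functorHits + blinker.toggles;
}
```

Calling `fireAllKinds()` invokes all four callables through the same slot type, which is the property that lets a class embed a timer and hand it one of its own member functions.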

Performance
I tried to optimize the performance as much as possible. Any further optimization input would be very welcome. Due to the high speed of the ARM core the performance of the software timers is pretty amazing and actually beats the hardware timers if you don't create too many instances.

Status
The code base is quite new and probably contains bugs and improvement possibilities.

As always: any feedback, bug reports and general rants are very welcome.
 
Found time to add some more timers for the T3.0 - T3.6 to TeensyTimerTool. Currently the library supports the following hardware and software timers:

  • T4.0: GPT1-GPT2(2 x 32bit channel each), TMR1-TMR4 (aka QUAD, 16 x 16bit channels), TCK (20 x 32bit channels)
  • T3.6: FTM0-FTM3 (20 x 16bit channels), TCK (20 x 32bit channels)
  • T3.5: FTM0-FTM3 (20 x 16bit channels), TCK (20 x 32bit channels)
  • T3.2: FTM0-FTM2 (12 x 16bit channels), TCK (20 x 32bit channels)
  • T3.0: FTM0-FTM1 (10 x 16bit channels), TCK (20 x 32bit channels)
  • TLC: TCK (10 x 32bit channels)

Each timer works in periodic and one shot mode.
 
This is very cool looking @luni!

Here is the example I just wrote - with simple change to newest example:
Code:
#include "TeensyTimerTool.h"

using namespace TeensyTimerTool;

void callback(uint32_t& someInt)  // this callback has context, i.e. parameter
{
	Serial.print("loop()/sec=");
	Serial.print(someInt);
	Serial.print(" us=");
	Serial.println(micros());
	someInt = 0;
}

//==============================================================
Timer t;

static uint32_t loopCnt = 0;
void setup()
{
	t.beginPeriodic([] { callback(loopCnt); }, 1'000'000);
}

void loop()
{
	loopCnt++;     // change every second
}

Results in this good looking SerMon output - showing it is being called in a timely fashion - down to the microsecond with no other code holding it up. And giving a loop() per second about where expected:
Code:
loop()/sec=5881751 us=269300004
loop()/sec=5881752 us=270300004
loop()/sec=5881752 us=271300004
loop()/sec=5881751 us=272300004


Writing it olde Style is a bit faster with added loop() clutter and timer management:
Code:
static uint32_t loopCnt = 0;
elapsedMicros tWait = 0;
void setup()
{
	while ( !Serial);
	tWait = 0;
}
void loop()
{
	loopCnt++;     // change every second
	if ( tWait >= 1000000 ) {
		Serial.print("loop()/sec=");
		Serial.print(loopCnt);
		Serial.print(" us=");
		Serial.println(micros());
		loopCnt = 0;
		tWait -= 1000000;
	}
}

gives this:
Code:
loop()/sec=7316314 us=15417001
loop()/sec=7316313 us=16417001
loop()/sec=7316313 us=17417001
loop()/sec=7316314 us=18417001

And adding "void yield(){}" to prior code gives:
Code:
loop()/sec=13042097 us=99407001
loop()/sec=13042098 us=100407001
loop()/sec=13042096 us=101407001
loop()/sec=13042097 us=102407001
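
A side note on the `tWait -= 1000000` line in the hand-written version above: subtracting the period instead of resetting to zero keeps the schedule from drifting when a deadline is noticed late. The sketch below is a host-side simulation in plain C++, not Teensy code. `fireTimes` is a made-up name, and the elapsed time is modelled as `now - start`, which is an assumption about how elapsedMicros behaves.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simulates a loop that polls every pollStepUs microseconds and fires once the
// elapsed time reaches periodUs. Returns the simulated times of the first
// `count` firings.
// subtractPeriod == true  models `tWait -= period` (schedule stays on the grid)
// subtractPeriod == false models `tWait = 0`       (lateness accumulates)
std::vector<uint32_t> fireTimes(bool subtractPeriod, uint32_t periodUs,
                                uint32_t pollStepUs, int count)
{
    assert(periodUs > 0 && pollStepUs > 0);
    std::vector<uint32_t> fired;
    uint32_t now = 0;      // simulated micros()
    uint32_t start = 0;    // elapsed time is (now - start)

    while (static_cast<int>(fired.size()) < count)
    {
        now += pollStepUs;                 // one pass through loop()
        if (now - start >= periodUs)
        {
            fired.push_back(now);
            if (subtractPeriod)
                start += periodUs;         // tWait -= period
            else
                start = now;               // tWait = 0
        }
    }
    return fired;
}
```

With a 1000 us period polled every 300 us, the subtracting variant fires at 1200, 2100, 3000 and 4200, so the average period stays at 1000 us, while the resetting variant fires at 1200, 2400, 3600 and 4800 and settles on a 1200 us period.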
 
Here the result for the good old IntervalTimer

Code:
unsigned loopCnt = 0; 

void callback()
{
    Serial.print("loop()/sec=");
    Serial.print(loopCnt);
    Serial.print(" us=");
    Serial.println(micros());
    loopCnt = 0;
}

IntervalTimer t; 

void setup()
{
    t.begin(callback, 1'000'000);
}

void loop()
{
    loopCnt++;
}

Code:
loop()/sec=5881744 us=3300003
loop()/sec=5881745 us=4300003
loop()/sec=5881744 us=5300003
loop()/sec=5881745 us=6300003
loop()/sec=5881744 us=7300003
loop()/sec=5881745 us=8300003

However, I don't really understand why doing that without timers gives more loops()/s? The timers are called only once per second, so any inefficiency should play no role?
 
Something odd on your end ? T_4.0 at 600 MHz?

This code - need to add volatile given _isr() calling:
Code:
volatile unsigned loopCnt = 0; 

void callback()
{
    Serial.print("loop()/sec=");
    Serial.print(loopCnt);
    Serial.print(" us=");
    Serial.println(micros());
    loopCnt = 0;
}

IntervalTimer t; 

void setup()
{
    t.begin(callback, 1'000'000);
}

void loop()
{
    loopCnt++;
}

yields this:
Code:
loop()/sec=13634958 us=43300002
loop()/sec=13634958 us=44300002
loop()/sec=13634958 us=45300002
loop()/sec=13634958 us=46300002

AND SOMETHING REALLY STUPENDOUS ????
Put this before the above code :: void yield(){}

and it gives this? :: 74,992,107
Code:
loop()/sec=74992107 us=53300002
loop()/sec=74992106 us=54300002
loop()/sec=74992106 us=55300002
loop()/sec=74992106 us=56300002
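
To put these loop rates into perspective, they can be converted into average CPU cycles per loop() pass. The helper below is only a back-of-the-envelope sketch: `cyclesPerLoop` and `kCpuHz` are made-up names, and the 600 MHz clock is taken from the board settings mentioned in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the core runs at 600 MHz, as in the board settings in this thread.
constexpr uint32_t kCpuHz = 600'000'000;   // digit separators need C++14 or later

// Average CPU cycles spent per loop() pass, rounded to the nearest cycle.
uint32_t cyclesPerLoop(uint32_t loopsPerSecond)
{
    assert(loopsPerSecond > 0);
    return (kCpuHz + loopsPerSecond / 2) / loopsPerSecond;
}
```

By this estimate, 13'634'958 loops per second correspond to about 44 cycles per pass with the stock yield(), and 74'992'107 to about 8 cycles with the empty yield(), so most of the time per pass appears to be spent in yield().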

<EDIT>: Also cool that apostrophe ' allows writing 1000000 as 1'000'000!
 
Silly me... now it gives consistent results :)

Shows the trouble of using _isr()'s :( - as does the following - shows the value of the TCK software 'polling' for end use predictability.

Question: Is there a TTT_yield() to call that will ping the timers without calling the inbuilt yield() or delay()/that calls that yield() - that does the ping of all Serial ports?

I got a higher number 99'989'701, In a while(1) in loop() with IntervalTimer:
loop()/sec=99989701 us=16300002
loop()/sec=99989700 us=17300002
loop()/sec=99989700 us=18300002
It is consistent - but not meaningful.

Of course that is bad form - as the _isr() is doing Serial.print.

Added another volatile and had the _isr() just record the current count, when loop() sees the non-zero count it does the print.

With normal yield() it of course runs slower but consistently:
Code:
loop()/sec=12243658	us=271300002
loop()/sec=12243658	us=272300002
loop()/sec=12243658	us=273300002

With void yield() even with the volatile vars it isn't atomic and goes very wrong:
Code:
loop()/sec=49994830	us=58300002
loop()/sec=99989661	us=59300002
loop()/sec=149984492	us=60300002
loop()/sec=49994827	us=61300002
loop()/sec=99989658	us=62300002
loop()/sec=149984489	us=63300002
loop()/sec=49994828	us=64300002

Not sure if something is missed that explains it other than non-atomic interrupt?:
Code:
void yield() {}
volatile uint32_t loopCnt = 0;
volatile uint32_t loopCntIsr = 0;

void callback()
{
	loopCntIsr = loopCnt;
	loopCnt = 0;
	asm volatile ("dsb"); // otherwise it doubles
}

IntervalTimer t;

void setup()
{
	t.begin(callback, 1'000'000);
}

void loop()
{
	loopCnt++;
	if ( 0 != loopCntIsr ) {
		Serial.print("loop()/sec=");
		Serial.print(loopCntIsr);
		Serial.print("\tus=");
		Serial.println(micros());
		loopCntIsr = 0;
	}
}
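
One possible explanation for the 1x, 2x, 3x pattern above, not confirmed anywhere in this thread, is a lost update: `loopCnt++` on a volatile is a separate load, add and store, and if the interrupt lands between the load and the store, the store writes the old count plus one over the ISR's reset. The host-side sketch below (plain C++, no Teensy registers; `snapshots` is a made-up name) models exactly that interleaving. It is a simulation under that assumption, not a diagnosis.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simulates `periods` timer periods with `perPeriod` loop passes each and
// returns what the ISR would record as loopCntIsr in every period.
// interruptMidIncrement == true : the ISR always lands between the load and
//                                 the store of one `loopCnt++`.
// interruptMidIncrement == false: the ISR always lands between two increments.
std::vector<uint32_t> snapshots(bool interruptMidIncrement,
                                uint32_t perPeriod, int periods)
{
    assert(periods >= 0);
    std::vector<uint32_t> recorded;
    uint32_t loopCnt = 0;

    for (int p = 0; p < periods; ++p)
    {
        for (uint32_t i = 0; i < perPeriod; ++i)
            loopCnt = loopCnt + 1;            // undisturbed increments

        if (interruptMidIncrement)
        {
            const uint32_t loaded = loopCnt;  // main loop: load for the next ++
            recorded.push_back(loopCnt);      // ISR: loopCntIsr = loopCnt
            loopCnt = 0;                      // ISR: loopCnt = 0
            loopCnt = loaded + 1;             // main loop: store undoes the reset
        }
        else
        {
            recorded.push_back(loopCnt);      // ISR: loopCntIsr = loopCnt
            loopCnt = 0;                      // ISR: loopCnt = 0
        }
    }
    return recorded;
}
```

In this model each snapshot grows by the per-period count plus one, which is also the spacing of the values printed above (49994830, 99989661 and 149984492 differ by 49994831). If this is indeed the cause, a barrier instruction would not be expected to cure it, whereas making the increment atomic with respect to the interrupt would.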
 
Question: Is there a TTT_yield() to call that will ping the timers without calling the inbuilt yield() or delay()/that calls that yield() - that does the ping of all Serial ports?
Currently the Tick Timer has a hard-wired override of yield() (see Teensy/TCK/TCK.cpp, line 12, if you want to play). That of course needs to be configurable in the future.
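
For readers wondering what a polled software timer looks like in principle, here is a minimal host-side sketch. It is not the TCK implementation: `PolledTimer`, `begin` and `tick` are hypothetical names, and the current time is passed in explicitly so the sketch runs without any Teensy core functions.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical polled software timer; illustrates the idea only.
class PolledTimer
{
public:
    // Arms the timer in periodic mode, starting at time nowUs.
    void begin(std::function<void()> cb, uint32_t periodUs, uint32_t nowUs)
    {
        assert(cb && periodUs > 0);
        callback = std::move(cb);
        period = periodUs;
        start = nowUs;
        running = true;
    }

    // Must be called as often as possible, e.g. from a yield() override.
    void tick(uint32_t nowUs)
    {
        if (running && nowUs - start >= period)  // unsigned math survives wrap-around
        {
            start += period;                     // stay on the period grid
            callback();
        }
    }

private:
    std::function<void()> callback;
    uint32_t period = 0;
    uint32_t start = 0;
    bool running = false;
};
```

The callback runs in the context of whoever calls `tick`, not in an interrupt, which is why such a timer depends on yield() (or a replacement for it) being called frequently.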

I got a higher number 99'989'701, In a while(1) in loop() with IntervalTimer ... It is consistent - but not meaningful.
Why? I get the same numbers. 100'000'000 increments per second seems reasonable for a 600 MHz processor. A bit on the slow end, but there is also other background stuff running...

With void yield() even with the volatile vars it isn't atomic and goes very wrong... Not sure if something is missed that explains it other than non-atomic interrupt?

Could be some caching issue? The loopCnt still sits in the cache when the ISR returns? Can you try adding 'asm volatile("dsb");' as the last line of the ISR? That should wait until the cache is written before leaving the ISR.


EDIT: Oh, cross post. :)
 
A missing dsb?

Great Tool, Luni!

Thanks Frank - added above - that was needed in the callback()

>> asm volatile ("dsb"); // otherwise it doubles

As noted - _isr() code gets ODD!

How can one tell if the dsb is needed? Many things 'seem' to run the same fine way without it - others - like the PowerOff test we did 3 days ago - clearly show it as critical?

It never double printed when the print code was there? And the code there is odd that it would change the counts that way - unless the second hit after the first hit was timed for breaking the variable transfer not being atomic? But why those numbers?

As noted - _isr() code gets ODD!

It seems that having a "proper debugger" would interfere with seeing things like that properly given how it has to alter program timing? Though maybe the doubled _isr() call would show up directly if one had the patience to walk through the code … but how does one walk through that?
 
Currently the Tick Timer has a hard-wired override of yield() (see Teensy/TCK/TCK.cpp, line 12, if you want to play). That of course needs to be configurable in the future.
Scanned the code and found that yield() after first post here - expected it was calling back to the PJRC yield() but didn't find that - so it just replaced it as needed?
Why? I get the same numbers. 100'000'000 increments per second seems reasonable for a 600 MHz processor. A bit on the slow end, but there is also other background stuff running...
- it is a meaningless piece of running a counter fast with no meaningful use - the compiler probably would have removed it if not volatile :)
Could be some caching issue? The loopCnt still sits in the cache when the ISR returns? Can you try adding 'asm volatile("dsb");' as the last line of the ISR? That should wait until the cache is written before leaving the ISR.

Good cross post though - it needed the dsb :)

The dsb prevents the _isr() from re-entering AFAIK. In 'some' cases MCU calls _isr() and without the dsb it doesn't take it off the list and calls it again? So if there were a counter in there as I've seen before it double counts. Perhaps the Serial.print()'s played with interrupts in a way that in effect does what dsb does?

The T4's OCRAM { on chip RAM } in the lower 512KB as configured runs at CPU speed {code and data} so it doesn't pass through a cache.
 
Hi,
I stumbled across this just this morning and thought I'd take a look.
However, trying to compile any of the included examples fails with the following message:

Code:
Arduino: 1.8.10 (Mac OS X), TD: 1.48, Board: "Teensy 4.0, Serial, 600 MHz, Faster, German (Mac)"

In file included from /Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/TeensyTimerTool.h:43:0,
                 from /Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/examples/CallbackWithParams/CallbackWithParams.ino:1:
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h: In static member function 'static TeensyTimerTool::ITimerChannel* TeensyTimerTool::GPT_t<moduleNr>::getTimer()':
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:35:56: warning: there are no arguments to 'CCM_CCGR1_GPT1_BUS' that depend on a template parameter, so a declaration of 'CCM_CCGR1_GPT1_BUS' must be available [-fpermissive]
             CCM_CCGR1 |= CCM_CCGR1_GPT1_BUS(CCM_CCGR_ON) | CCM_CCGR1_GPT1_SERIAL(CCM_CCGR_ON);
                                                        ^
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:35:93: warning: there are no arguments to 'CCM_CCGR1_GPT1_SERIAL' that depend on a template parameter, so a declaration of 'CCM_CCGR1_GPT1_SERIAL' must be available [-fpermissive]
             CCM_CCGR1 |= CCM_CCGR1_GPT1_BUS(CCM_CCGR_ON) | CCM_CCGR1_GPT1_SERIAL(CCM_CCGR_ON);
                                                                                             ^
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h: In instantiation of 'static TeensyTimerTool::ITimerChannel* TeensyTimerTool::GPT_t<moduleNr>::getTimer() [with unsigned int moduleNr = 0u]':
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:70:44:   required from here
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:35:44: error: 'CCM_CCGR1_GPT1_BUS' was not declared in this scope
             CCM_CCGR1 |= CCM_CCGR1_GPT1_BUS(CCM_CCGR_ON) | CCM_CCGR1_GPT1_SERIAL(CCM_CCGR_ON);
                                            ^
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:35:81: error: 'CCM_CCGR1_GPT1_SERIAL' was not declared in this scope
             CCM_CCGR1 |= CCM_CCGR1_GPT1_BUS(CCM_CCGR_ON) | CCM_CCGR1_GPT1_SERIAL(CCM_CCGR_ON);
                                                                                 ^
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h: In instantiation of 'static TeensyTimerTool::ITimerChannel* TeensyTimerTool::GPT_t<moduleNr>::getTimer() [with unsigned int moduleNr = 1u]':
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:71:44:   required from here
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:35:44: error: 'CCM_CCGR1_GPT1_BUS' was not declared in this scope
             CCM_CCGR1 |= CCM_CCGR1_GPT1_BUS(CCM_CCGR_ON) | CCM_CCGR1_GPT1_SERIAL(CCM_CCGR_ON);
                                            ^
/Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool/src/Teensy/GPT/GPT.h:35:81: error: 'CCM_CCGR1_GPT1_SERIAL' was not declared in this scope
             CCM_CCGR1 |= CCM_CCGR1_GPT1_BUS(CCM_CCGR_ON) | CCM_CCGR1_GPT1_SERIAL(CCM_CCGR_ON);
                                                                                 ^
Multiple libraries were found for "TeensyTimerTool.h"
 Used: /Users/wpunkts/Documents/Arduino/libraries/TeensyTimerTool
Error compiling for board Teensy 4.0.

Could this be a problem with my specific setup?
 
I suspect that we will have much more fun with dsb now that we know about it. We may need to add it to many old libraries, too... I'd say that's a
thing we forgot to do during the beta phase...
 
Hi,
I stumbled across this just this morning and thought I'd take a look.
However, trying to compile any of the included examples fails with the following message:
Looks like you use an old Teensyduino. A couple of those CCM_CCGR1_... defines changed names in the current version.

I suspect that we will have much more fun with dsb now that we know about it. We may need to add it to many old libraries, too... I'd say that's a
thing we forgot to do during the beta phase...
Just checked the IntervalTimer, looks like it doesn't implement a DSB at the exit of the ISR. Adding it will slow it down even further but it obviously is required. I'll add DSBs to TeensyTimerTool this evening.
 
Maybe we should add it to every single interrupt, without too much thinking. NXP does it.
Dsb is not necessarily slow. It depends on the number of writes and the bus speeds, I think.

As I understand it, the M4 has this issue, too, but way less often because its CPU is slower (compared to the buses) and not capable of dual-issue.
 
I suspect that we will have much more fun with dsb now that we know about it. We may need to add it to many old libraries, too... I'd say that's a
thing we forgot to do during the beta phase...

Unfortunately, adding dsb does not work reliably. By resetting or power cycling the board a few times, it can be brought into the same state as without the dsb. To exclude any bugs in TeensyTimerTool I extracted the relevant code, which shows the same behaviour. I have run out of ideas; maybe (hopefully) someone finds a bug in the code below. It sets up QUAD1 as a counter with a period of 25 ms and attaches a simple ISR. The output should be a relatively constant value showing the number of loopCnt increments per 25 ms, followed by the current micros().

Code:
volatile uint32_t loopCnt = 0;
  
void isr()
{   
    TMR1_CSCTRL0 &= ~TMR_CSCTRL_TCF1; // clear the timer flag
 
    Serial.printf("%d %d\n",loopCnt, micros());
    loopCnt = 0;
        
    // asm volatile("dsb");   // no effect
    // asm volatile("dmb");
    // asm volatile("isb");
}

void setup()
{   
    attachInterruptVector(IRQ_QTIMER1, isr);

    CCM_CCGR6 |= CCM_CCGR6_QTIMER1(CCM_CCGR_ON);
    TMR1_CTRL0 = 0x0000;
    TMR1_LOAD0 = 0x0000;
    TMR1_COMP10 = 29'296; // 25ms
    TMR1_CMPLD10 = 29'296;
    TMR1_CTRL0 = TMR_CTRL_CM(1) | TMR_CTRL_PCS(0b1111) | TMR_CTRL_LENGTH;
    TMR1_CSCTRL0 &= ~(TMR_CSCTRL_TCF1);
    TMR1_CSCTRL0 |= TMR_CSCTRL_TCF1EN;
    NVIC_ENABLE_IRQ(IRQ_QTIMER1);

    while (1)
    {
        loopCnt++;
    }
}

void loop()
{
}

Here the 'normal' output

Code:
2499117 1671417
2499118 1696417
2499116 1721417
2499117 1746417
2499117 1771417
2499117 1796418
2499117 1821418
2499116 1846418
2499117 1871418
2499116 1896418
2499118 1921418

After a few resets I can bring it into this state:

Code:
2499115 4796333
4998232 4821333
2499114 4846333
2499116 4871333
4998233 4896333
2499116 4921334
2499115 4946334
4998232 4971334
2499115 4996334
2499116 5021334
4998232 5046334
2499116 5071334

In this state it looks like it does not always clear the loopCnt as it should (values are doubled). Maybe this is some pipeline effect? Adding dsb dmb or isb does not really change anything. This is absolutely weird. I can only hope that I overlooked something....

(and: no, it is not the printout in the ISR, replacing this by just storing the value and display it in the while loop shows the same effect)
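
For reference, the compare value 29'296 in the test sketch above can be reproduced with a little arithmetic. The helper below is a sketch under two assumptions that are not stated in this thread: the QUAD timer is clocked from a 150 MHz bus clock, and `TMR_CTRL_PCS(0b1111)` selects that clock divided by 128. `compareForPeriodUs`, `kBusHz` and `kPrescaler` are made-up names.

```cpp
#include <cassert>
#include <cstdint>

// Assumptions (not stated in the thread): 150 MHz timer input clock and a
// prescaler of 128 for TMR_CTRL_PCS(0b1111).
constexpr uint32_t kBusHz = 150'000'000;
constexpr uint32_t kPrescaler = 128;

// Compare value for a period given in microseconds, truncated to whole ticks.
uint32_t compareForPeriodUs(uint32_t periodUs)
{
    const uint64_t ticks = static_cast<uint64_t>(kBusHz) * periodUs
                           / (static_cast<uint64_t>(kPrescaler) * 1'000'000);
    assert(ticks <= 0xFFFF);   // the channel counter is only 16 bits wide
    return static_cast<uint32_t>(ticks);
}
```

Under these assumptions one tick lasts about 0.85 us, `compareForPeriodUs(25'000)` gives 29'296, and the longest period a single 16-bit channel can reach is roughly 55.9 ms.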
 
Got the same number/behavior with p#19 code.

Moved the print from _isr and it never printed until DSB or DMB were added - ISB didn't do anything in quick glance?

Moving the print to the while(1) as below only reduced the counted cycles minimally - but only printed as noted with "dsb"

However, the count goes from 'normal' to 'double' count in turn, working better or worse at times, when using TyCommander 'Reset' or USB power off > on.
Worst output with code below:
Code:
7497415	 1044674
2499122	 1069674
4998245	 1094674
7497365	 1119675
2499122	 1144675

So that shows the 'dsb' has some value without Serial.print() in the _isr(). Perhaps having the print in the _isr() affects how the interrupt system processes things, and printing there is understood to be not ideal for _isr() usage anyway.

Code:
volatile uint32_t loopCnt = 0;
volatile uint32_t loopCntIsr = 0;

void isr()
{
	TMR1_CSCTRL0 &= ~TMR_CSCTRL_TCF1; // clear the timer flag

	loopCntIsr = loopCnt;
	loopCnt = 0;

	asm volatile("dsb");   // needed without Serial.print() in _isr
	// asm volatile("dmb");
	// asm volatile("isb"); // xxx
}

void setup()
{
	while (!Serial && millis() < 4000 );
	Serial.println("\n" __FILE__ " " __DATE__ " " __TIME__);
	attachInterruptVector(IRQ_QTIMER1, isr);

	CCM_CCGR6 |= CCM_CCGR6_QTIMER1(CCM_CCGR_ON);
	TMR1_CTRL0 = 0x0000;
	TMR1_LOAD0 = 0x0000;
	TMR1_COMP10 = 29'296; // 25ms
	TMR1_CMPLD10 = 29'296;
	TMR1_CTRL0 = TMR_CTRL_CM(1) | TMR_CTRL_PCS(0b1111) | TMR_CTRL_LENGTH;
	TMR1_CSCTRL0 &= ~(TMR_CSCTRL_TCF1);
	TMR1_CSCTRL0 |= TMR_CSCTRL_TCF1EN;
	NVIC_ENABLE_IRQ(IRQ_QTIMER1);

	while (1)
	{
		loopCnt++;
		if ( 0!=loopCntIsr ) {
			Serial.printf("%d\t %d\n", loopCntIsr, micros());
			loopCntIsr = 0;
		}
	}
}

void loop()
{
}
 
Well.. there is still a small chance that the reason is USB or the EventResponder (which even gets called in the systick!) - all this adds non-reproducible timings.
All these interrupt disables throughout the core are not very helpful, either.
It gets more and more difficult to produce minimal-jitter timings.


Edit: Raising the priority does not help....
Edit: Maybe we can find something if we disable USB and use a scope? Oops.. just found we do not have "no USB" - is there a reason for that?
 
.. commenting out all eventtimer stuff and removing it from systick also didn't work...


Edit: also compiling with gcc-arm-none-eabi-9-2019-q4 doesn't change anything
 
The systick does not have a "dsb", either :)

Tim's code gets pretty stable with the DSB immediately after
TMR1_CSCTRL0 &= ~TMR_CSCTRL_TCF1; // clear the timer flag


Code:
void isr()
{
  TMR1_CSCTRL0 &= ~TMR_CSCTRL_TCF1; // clear the timer flag
   asm volatile("dsb");   // needed without Serial.print() in _i

So, yes, dsb helps - but it should be near resetting the timer-clear-flag

EDIT: NO ?!?! It worked ONE time - after the reset, all is as before ?!?
 
Yes :) just tried it. Did we find a bug in the Chip? What about that ARM addendum you mentioned in the morning?
 