Thread safety, please...

Status
Not open for further replies.

Brooks

Well-known member
I've been thinking a lot about the Teensy 3.5 & 3.6 and their potential for preemptive schedulers. I've written such schedulers in the past on microprocessors with fewer capabilities than these devices, and I'm planning to do so again with the new Teensies.

I believe a part of our future lies in multi-threading. These processors are too capable to hold them to a single thread.

I've been eyeing the IntervalTimer class (here only as an example) and thinking about the changes I'll have to make to make it thread-safe, or to be able to use it as a base class for something that is thread-safe.

My suggestion is that you keep in mind thread-safety while writing or updating library code. There's generally little cost in making a bit of code thread-safe, and we can hide this cost in the big jump in T3.5/3.6 processor speed. While there would be little return on your time now, it might make life simpler in the future.

I wish there were a place to keep thread-safe conversions for general use. I don't want to create two sets of code that need to be maintained...
 
Believe me, plenty of work has gone into making things thread safe.

Please don't mistake lack of response on this thread for lack of interest. We usually talk about specifics on this forum when it comes to code and software design.
 
I believe a part of our future lies in multi-threading.
Actually it is virtualization, which implies having multiple threads, among other things. And this is not even theorizing, this has been done already for example in phones, cars, airplanes etc. But this is just a note on the sideline :)

My suggestion is that you keep in mind thread-safety while writing or updating library code.
I had a look at various libraries, and the ones I'd call more feature complete usually contain thread safety in the form of disabling/enabling interrupts for the critical operations. However, I doubt you can expect people to create a more or less hollow abstraction of everything multithread related, i.e. begin/end of mutually exclusive areas, locks and all that, just so you can more easily port the code to systems doing actual multithreading or even having multiple CPU cores. The primary target is the Arduino environment, which doesn't have any form of multithreading aside from ISRs. So the IRQ disable/enable is all that is needed (ideally in a fashion where it can safely be nested, but that is another topic).

For my own spare time project I was pondering the idea of using some kind of abstraction for this very purpose, but decided against it in favor of IRQ disable/enable as well. Just not worth the headache considering that if I end up on a different platform I will have to redo pretty much everything due to the very low level / hardware bound code (going the barebone way). So no reason to add code/define clutter.
 
favor of IRQ disable/enable.
I was going to suggest this approach:
Enter safe area:
* Push the interrupt status register
* Disable interrupts

<Do what is needed>

Exit the safe area:
* Pop the interrupt status register

This approach keeps you safe if you call a thread-safe routine from an interrupt service routine. The trouble is that you have to keep track of the fact that you've got stuff on the stack. I don't have a good answer to this; traditionally I've tried to keep the entry and exit points very close together (so their presence is hard to miss), and bury them in code areas that aren't often modified.

I have in mind to create safe_disable() and safe_enable() macros, similar to enable() and disable(). I was wondering if anyone had better ideas...
 
No kidding. I'm thinking this makes more sense on a BeagleBone or Raspberry Pi. Lots of MHz available....
 
Are you sure that this is needed for ARM Cortex?
If you call from an ISR a routine that itself does the disable -- enable sequence, you'll be returning to the ISR with interrupts enabled. Probably it will be ok. But if someone calls the routine and then expects the interrupts to still be disabled (it is an ISR, after all), they're in for a big surprise.

And these types of things can be very difficult to track down...
 
As far as I know, interrupts can interrupt each other without problems on ARM Cortex (the CPU takes care of it), and there is no need to disable anything..? (Only for special cases like accessing (edit: shared) global variables)
(Am I wrong?)

But anyway, I think Constantin is right. I'm not sure that multithreading is a good thing on microcontrollers.
 
If you call from an ISR a routine that itself does the disable -- enable sequence, you'll be returning to the ISR with interrupts enabled.
Yes. I made saving the status explicit, though (return value in disable, argument in enable), which also gives me greater flexibility in shifting the calls around if I really need it. Usually you keep them on the same level and close together, as you've outlined already.
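The explicit save/restore style described here (state returned by the disable call, passed back to the enable call) might look roughly like the sketch below. It is a host-side illustration only: the `sim_*` helpers stand in for the processor's interrupt mask so the code can run off-target, and the names `irq_save()`, `irq_restore()` and `bump_counter()` are made up for this example, not taken from any Teensy library.

```c
#include <stdint.h>

/* Host-side stand-in for the Cortex-M PRIMASK bit (assumption: on real
 * hardware these helpers would map onto the processor's own
 * get-mask / disable / enable primitives). 0 = interrupts enabled. */
uint32_t sim_primask = 0;
uint32_t sim_get_primask(void) { return sim_primask; }
void sim_disable_irq(void)     { sim_primask = 1; }
void sim_enable_irq(void)      { sim_primask = 0; }

/* Hypothetical helper: disable interrupts and return the previous state. */
uint32_t irq_save(void)
{
    uint32_t state = sim_get_primask();
    sim_disable_irq();
    return state;
}

/* Hypothetical helper: re-enable interrupts only if they were enabled
 * when the matching irq_save() was called. */
void irq_restore(uint32_t state)
{
    if (state == 0) {
        sim_enable_irq();
    }
}

int shared_counter = 0;

/* A routine that guards its own critical section. It leaves the
 * interrupt state exactly as the caller had it. */
void bump_counter(void)
{
    uint32_t state = irq_save();
    shared_counter++;
    irq_restore(state);
}
```

Because the saved state lives in a local variable, the pair nests naturally: an inner `irq_save()` sees interrupts already off and its `irq_restore()` leaves them off, so a caller that entered with interrupts disabled gets them back disabled.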

As far as I know, interrupts can interrupt each other without problems on ARM-Cortex [...]
I read up on that in a bit more detail recently, and if I didn't miss anything the ARM core will nest an interrupt only if the interrupt has a higher priority. Same and lower priority interrupts won't nest, which includes the interrupt you are currently serving. As for what happens if you change the priority in the ISR itself ... dunno.

And the reason we have such things in place is always one of two: shared resources, or critical parts of code (i.e. timing critical or "atomic operations").
 
I just did something with this last night - the code assumes interrupts are enabled and disables them ( T_3.6 EEPROM write ) then enables them on exit. If the caller had them off on entry I will have left them enabled on exit - SURPRISE!

Is there an easy way to see the state before __disable_irq( ) { //my code } __enable_irq( ) ?
 
Is there an easy way to see the state before __disable_irq( ) { //my code } __enable_irq( ) ?
It's been a while since I've looked at the M4 registers, but I *think* you can create an inline assembly routine that grabs the status and returns it. I'll try to find my code...
 
Looking at the Teensy source tree I see many dozens of __disable_irq(), and they can't all be in int() handler code - where they would be expected to be active. It is never conditional and never associated with a state check/restore that I see. This isn't a threaded world - but having unexpected __enable_irq() just done from a function call could be tough to debug afterwards.

In eeprom.c :: eeprom_initialize() is used liberally - if I went in with __disable_irq() - I could come out with __enable_irq(), but it seems this if() will be skipped except the first time the EEPROM is accessed. The CPP code for EEPROM only does this once at creation; the eeprom.c code places it a dozen times as there is no common entry point.

With my WIP code to enable high speed T_3.6 EEPROM writes { by dropping HSRUN } - any EEPROM access would return with interrupts enabled on the T_3.6.

void eeprom_initialize(void)
{
    // ...
    if (FTFL_FCNFG & FTFL_FCNFG_RAMRDY) {
        // ...
        __disable_irq();
        // do_flash_cmd() must execute from RAM. Luckily the C syntax is simple...
        (*((void (*)(volatile uint8_t *))((uint32_t)do_flash_cmd | 1)))(&FTFL_FSTAT);
        __enable_irq();
        // ...
    }
    // ...
}
 
Tim, in this case, i'd use my own eeprom-code..
Or edit the original one and insert the HSRUN additions.
 
Tim, in this case, i'd use my own eeprom-code..
Or edit the original one and insert the HSRUN additions.

Frank, editing the original EEPROM code is what I'm working on so it can be tested/used more generally - to have it drop the HSRUN state for the duration of the call if it is found to be high - otherwise the writes just fail.

What is your "own eeprom-code"? And what case?
 

Nice article! And this is the answer I was looking for, smart interrupt re-enable without the stack use of my example. Here's the pertinent code snippet from the article:

uint32_t prim;

/* Do some stuff here which can be interrupted */

/* Read PRIMASK register, check interrupt status before you disable them */
/* Returns 0 if they are enabled, or non-zero if disabled */
prim = __get_PRIMASK();

/* Disable interrupts */
__disable_irq();

/* Do some stuff here which can not be interrupted */

/* Enable interrupts back only if they were enabled before we disable it here in this function */
if (!prim) {
    __enable_irq();
}
 

That second link speaks exactly to my concern - the (STM) library solution offered is to always use a common set of INT OFF/ON functions that COUNT the requested transitions and only turn interrupts back on when the matching ON call brings the counter back to zero. I was hoping to see a read from the processor about the state of the interrupts used - like when I scanned all the Teensy tree uses of these functions. ... more to read
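For readers who have not seen the counting scheme, here is a minimal host-side sketch of the idea as described above: every OFF request increments a counter, every ON request decrements it, and interrupts are only really re-enabled when the counter reaches zero. The names `irq_off()`, `irq_on()` and `irq_depth` are invented for illustration and are not the STM library's actual API; `cnt_primask` simulates the processor's interrupt mask so the code runs off-target.

```c
#include <stdint.h>

/* Simulated interrupt mask (assumption: replaced by the real
 * disable/enable primitives on hardware). 0 = interrupts enabled. */
uint32_t cnt_primask = 0;

/* Number of OFF requests that have not yet been matched by an ON. */
uint32_t irq_depth = 0;

/* Hypothetical OFF call: always disable, then count the request. */
void irq_off(void)
{
    cnt_primask = 1;      /* disable first, so the counter update is safe */
    irq_depth++;
}

/* Hypothetical ON call: only the outermost ON really re-enables. */
void irq_on(void)
{
    if (irq_depth > 0) {
        irq_depth--;
        if (irq_depth == 0) {
            cnt_primask = 0;
        }
    }
}
```

The scheme only holds up if every piece of code goes through the same pair of functions. A direct disable somewhere else is invisible to the counter, and the outermost `irq_on()` will then re-enable interrupts regardless, which is the surprise discussed earlier in the thread.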

The first link is interesting - but Paul already explained most of that in places and it doesn't show how to read the state of the interrupts as ON or OFF - just how to set them globally or selectively.
 
This is quite a high price to pay, performance-wise, in every place that needs a critical section, and those are normally very short.

All so that "stuff here which can not be interrupted" is allowed to call other functions. (it should not)

I do believe we need much better documentation and guidelines about how these and other mechanisms are to be used. For PRIMASK, usage should always be for only very short critical sections.

Exclusive access for longer times needs to be done other ways, such as disabling only a specific interrupt, or with APIs like SPI transactions. Or in the case of an RTOS or other system using threading, with its locking or mutex features which minimize this sort of global interrupt disable.
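As a rough illustration of "disabling only a specific interrupt", the sketch below masks one interrupt source while a value shared with that handler is updated, and leaves every other interrupt running. It is a host-side simulation: `sim_nvic_enabled` stands in for the interrupt controller's enable bits, and `TIMER_IRQ`, `timer_config` and `update_timer_config()` are hypothetical names. On a Teensy the mask and unmask steps would presumably map onto the core's per-interrupt enable/disable macros (believed to be `NVIC_DISABLE_IRQ(n)` and `NVIC_ENABLE_IRQ(n)`; treat that mapping as an assumption to verify).

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated interrupt-enable bits, one per IRQ number 0..31. */
uint32_t sim_nvic_enabled = 0xFFFFFFFFu;

bool sim_irq_is_enabled(unsigned n) { return ((sim_nvic_enabled >> n) & 1u) != 0; }
void sim_irq_disable(unsigned n)    { sim_nvic_enabled &= ~(1u << n); }
void sim_irq_enable(unsigned n)     { sim_nvic_enabled |= (1u << n); }

#define TIMER_IRQ 5u               /* hypothetical IRQ number */

volatile uint32_t timer_config;    /* data shared with the timer handler */

/* Update data shared with one handler while masking only that handler. */
void update_timer_config(uint32_t value)
{
    bool was_enabled = sim_irq_is_enabled(TIMER_IRQ);
    sim_irq_disable(TIMER_IRQ);

    timer_config = value;          /* the timer handler cannot run here */

    if (was_enabled) {
        sim_irq_enable(TIMER_IRQ); /* restore only if it was on before */
    }
}
```

Unlike a global disable, the length of this section only delays the one masked interrupt, so it can afford to be somewhat longer.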
 
If you call from an ISR a routine that itself does the disable -- enable sequence, you'll be returning to the ISR with interrupts enabled. Probably it will be ok. But if someone calls the routine and then expects the interrupts to still be disabled (it is an ISR, after all), they're in for a big surprise.
 
Just doing the HSRUN drop testing for T_3.6 EEPROM writes - I'm looking to avoid anything that could make it report a failure - or cause surprises later.

EEPROM writes are made to look like a memory write - but they fail when HSRUN is set for speeds over 120 MHz. Obviously they are done rarely - but who knows where. If a couple of added instructions make writes at speed work usably, that is better than not having a way to use the T_3.6 EEPROM. And way better than where I started: dropping the speed to safely turn off HSRUN, when no interrupts could possibly work with the clocks running at perhaps half speed.

No doubt a few man-months went into making HSRUN work to support K66 speeds over 120 MHz, and for a reason. Making sure the processor isn't interrupted for other indeterminate tasks while it is 'under powered' seems proper. I put in the code to record the PRIMASK state and only call __enable_irq( ) if I did the disable.

It is working on what I changed so far - posted an update on EEPROM thread with PRIMASK use.
 
IMO, you shouldn't call other stuff with interrupts disabled.
Ideally, you are right, but this is sometimes easier said than actually done. If I grab a library, put some of my code plus a library call enclosed in disable/enable IRQ, I may already end up with such a scenario.

This opens up a set of questions:
  • Is disable/enable IRQ really required in this context?
  • Does the library documentation hint at the fact that it actually contains an exclusive area / disable/enable IRQ? (Alternatively you can read the source code ... if available.)
  • Are there other ways to protect the integrity of whatever needs to be done? (This question goes really far, from, let's say, finer grained lock control mechanisms to a complete redesign of what is done when.)

Simple thing, but often a cause of headaches: Data structures, which are processed by "normal" code and maybe even a couple interrupt handlers. Changing these data structures usually implies a global lock because that is the easiest way to ensure the data integrity. Ideally there are localized locks on whatever you are changing, and the time needed to change all that is short.

But what is short? For example, I wrote a small buddy memory allocator, and the estimated time for a call (alloc/free) is about 3µs @ 72MHz (i.e. can be less but also more). Is that short enough or could that cause problems already? Currently the functions are not locked at all, assuming that they are called serially (i.e. just from "normal" code). If that is not the case, these functions will cause a global lock, which is kind of broken, if you ask me, but the simplest way to make sure things work as they are supposed to. How can this be changed to something better, though? (Side note here: Memory allocation is nothing else than traversing somewhat more complex data structures, so could be whatever else. This is just used as an example.)
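This does not solve the allocator case, but for the simpler and very common situation where one interrupt handler hands data to the main loop, a single-producer/single-consumer ring buffer is one widely used way to avoid a global lock altogether. The sketch below is a generic illustration using C11 atomics, not code from any Teensy library; `ring_t`, `ring_put()` and `ring_get()` are names chosen for this example. It relies on exactly one writer (say, the ISR) calling `ring_put()` and exactly one reader (say, `loop()`) calling `ring_get()`.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 8u   /* capacity; must be a power of two */

typedef struct {
    uint8_t     data[RB_SIZE];
    atomic_uint head;   /* advanced only by the producer */
    atomic_uint tail;   /* advanced only by the consumer */
} ring_t;

/* Producer side (e.g. an ISR). Returns false if the buffer is full. */
bool ring_put(ring_t *r, uint8_t byte)
{
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RB_SIZE) {
        return false;                        /* full, drop or retry */
    }
    r->data[head & (RB_SIZE - 1u)] = byte;   /* write the slot first */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side (e.g. the main loop). Returns false if empty. */
bool ring_get(ring_t *r, uint8_t *out)
{
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail) {
        return false;                        /* nothing to read */
    }
    *out = r->data[tail & (RB_SIZE - 1u)];   /* read the slot first */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

Each index has a single writer, so neither side ever needs to disable interrupts; the indices are free-running counters and the power-of-two size keeps the full/empty arithmetic correct across wraparound. More complex structures with several writers still need some form of lock.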


PS: Thanks for the (forum) link!

PPS: Following my own train of thought, the question it leads to is pretty much "Could this become a matter of concern?" More powerful devices usually lead to more complex usage scenarios. In return, the Arduino world provides a (mostly) simplified interface, which in combination with some user education might prevent/avoid a good deal of potential trouble. I wouldn't be able to tell what the result of all this will be with the upcoming more powerful Teensy core modules. Knowledge of ways to mitigate one problem or another without resorting to the disable/enable IRQ sledgehammer is welcome, though :)
 
PPS: Following my own train of thought, the question it leads to is pretty much "Could this become a matter of concern?" More powerful devices usually lead to more complex usage scenarios. In return, the Arduino world provides a (mostly) simplified interface, which in combination with some user education might prevent/avoid a good deal of potential trouble. I wouldn't be able to tell what the result of all this will be with the upcoming more powerful Teensy core modules. Knowledge of ways to mitigate one problem or another without resorting to the disable/enable IRQ sledgehammer is welcome, though :)
Lest it be thought this is all theoretical, this is an issue I'm facing today:
* I'm receiving async data from an external servo using DMA
* I want to run an interval timer during the receive in case there's an error. Very basic stuff
* In the normal sequence of events (happens many times a second) I'll stop the interval timer from within the DMA-completion interrupt handler
* If my loop() processing is also using the IntervalTimer class, eventually my interrupt handler will run in the middle of loop()'s access of IntervalTimer. IntervalTimer's static data structures will eventually get messed up.

I want my code to be stable, so I'm going to have to fix this.

I wholeheartedly agree with Spex's comment, but submit to you that the future is now.

Commentary: I worked for some time with Intel 486 processors in a data comm application. The Teensy 3.2 is lots faster than the 486. I'll grant you the M4 architecture is limited in memory and I/O. Paul's T3.6 is clearly in the performance class of the early Pentiums.
 