Is there a good SIMPLE task library?

Status
Not open for further replies.

jwatte

I'm writing my controller code as a sequence of modules that each run their own little state machines.
The main code just loops through them in order and steps them to evaluate inputs and perhaps select a new output step.
This works great, is quite robust, and uses entirely statically allocated data -- perfect for embedded computing.
But the code for those state machines is somewhat annoying to evolve. (This is the main known drawback of the method in question.)

Code:
void Thing::step() {
  switch (state_) {
    case STATE_ONE:
      if (found_a_thing()) {
        gotoState(STATE_TWO);
      } else if (stateTimeout(300)) {
        gotoState(STATE_FIVE);
      }
      break;
    case STATE_TWO:
      ...
  }
}
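For readers wondering what sits behind gotoState() and stateTimeout(), here is a minimal sketch of one way such helpers could look. It is not the actual code from the post: the class name is made up, and the current time is passed in as a parameter (on a Teensy it would presumably come from millis()) so that the sketch also runs on a desktop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical base for a polled state machine; everything is plain
// member data, so instances can be statically allocated.
class StateMachine {
public:
  // Enter a new state and restart the per-state timer.
  void gotoState(uint8_t next, uint32_t now_ms) {
    state_ = next;
    entered_ms_ = now_ms;
  }

  // True once at least limit_ms have passed since the state was entered.
  // Unsigned subtraction keeps this correct across 32-bit timer wrap-around.
  bool stateTimeout(uint32_t limit_ms, uint32_t now_ms) const {
    return static_cast<uint32_t>(now_ms - entered_ms_) >= limit_ms;
  }

  uint8_t state() const { return state_; }

private:
  uint8_t state_ = 0;
  uint32_t entered_ms_ = 0;
};
```

Factoring the timer into gotoState() is what lets every case in the switch ask "have I been here too long?" without keeping its own timestamp.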

So, on bigger systems, where dynamic allocation won't lead to accidental thermonuclear death, we can use threads (separate from pre-emption -- cooperative threads are still threads.)

Code:
void Thing::loop() {
  if (wait_for_a_thing(300)) {
    do_state_two_things();
  } else {
    // timeout code goes here
  }
}

The drawback is that each state machine has to have its own call stack, the stack pointer and machine state have to be moved around when task switching, and each call stack must also be big enough to fit the largest interrupt handler.

But, let's say that I'm a lazy slob, and also that the Teensy3.2 has SOO MUCH SPACE that I'm not worried about running out -- what's a good task (thread) library to use? I imagine it needs to support approximately three primitives:
- spawn new task
- yield to other waiting tasks
- some kind of thread suspend/resume

Semaphores/locks would perhaps be useful, too, but can be written on top of those primitives.
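As a rough illustration of "written on top of those primitives", a counting semaphore for cooperative tasks might look like the sketch below. The yield call is passed in as a function pointer because the real name depends on whichever task library is used; fakeYield and the globals are stand-ins so the sketch runs without a scheduler. It assumes only tasks (not ISRs) touch the semaphore.

```cpp
#include <cassert>

// Hypothetical hook: the task library's "yield to other tasks" call.
using YieldFn = void (*)();

class Semaphore {
public:
  Semaphore(int initial, YieldFn yield_fn)
      : count_(initial), yield_(yield_fn) {}

  // Block cooperatively: keep yielding until a unit is available.
  // No atomics are needed because a task switch can only happen in yield_().
  void acquire() {
    while (count_ == 0) {
      yield_();
    }
    assert(count_ > 0);
    --count_;
  }

  void release() { ++count_; }
  int count() const { return count_; }

private:
  int count_;
  YieldFn yield_;
};

// Stand-in for a scheduler: on the third yield, pretend that another
// task released the semaphore.
static Semaphore* g_sem = nullptr;
static int g_yields = 0;
static void fakeYield() {
  ++g_yields;
  if (g_yields == 3) {
    g_sem->release();
  }
}
```

This is the "wake up, check, yield again" style of blocking; it is simple, but a waiting task still gets scheduled on every pass.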

Note that I'm looking for convenience here -- something that is a drop-in for Teensy would be great. Something that requires I re-tool, download a proprietary IDE, and use the stock Freescale support libraries, not so much.
 
Here is a task scheduler library I wrote (Zilch), based on William Gay's fibers library here, with a bunch of examples for the Teensy 3.x/LC.
 
stevech: That's what I'm doing now! It's robust, statically allocated, and 100% predictable. All things I like!
I have about a dozen devices that each run through states talking to multiple I2C, SPI, UART, and other devices. It's getting kind-of ludicrous really :)
Sure, the simplest device is the "blinker" that just turns an LED on and off in some pattern requested by some other component (for status) and that's not particularly burdensome.
And then I realize "oh, I need to poll another 32-bit word from controllers 0x80 and 0x81 on bus Serial2" and that means adding states for sending the next poll command, awaiting the ack, testing for timeout, and doing it all again for the next device, and the busy-work gets tedious.

Duff presumably wanted to include a link? Meanwhile I'll check out William Gay's fibers, which might be sufficient.

Oh -- the link is to yours, not Will's. Your text confused me :)
 
I've done lots of time critical embedded solutions using multiple finite state machines (FSM). These don't need a stack per FSM. To make any one FSM stateful, make the variables that must persist static inside the FSM or, in C, static to the one .c file per FSM.
Or declare a block of RAM as static and cast that to a struct containing persistent variables.
Using a stack per FSM defeats the intended simplicity - when a preemptive scheduler isn't needed (which is most often the case).

For me, I've used this for so many years, it's second nature.
Also C++ makes it hard. For many reasons, I avoid using C++ with dynamically allocated objects (so that includes the String class), as heap allocation and the housekeeping that comes with it are not something I can normally allow in time sensitive apps. Static C++ objects (i.e., instanced at compile time) are OK, but I still avoid them due to dangers in how the constructors are written, esp. in terms of I/O.
 
These don't need a stack per FSM

I agree! Where do I say that they do?

For me, I've used this for so many years, it's second nature.

I agree, too! I've done it that way for many years!

But in my other life, working on big boy machines, the convenience of a thread/task/fiber/cooperative-sequential-process is so common, and I hit the limit of "really? another four states just for this?" that I thought it worth asking.
Also, I think the state machine code will actually take more flash program space than the fiber-based code. The control flow ends up costing a lot of instruction space, even when common things like watchdogs for state times are factored out.
I'm going to check out Zilch, and probably make it even more lightweight (I don't need inter-task messaging; globals are fine for me!)

due to dangers in how the constructors are written

I guess here we diverge :)
I understand 100% how C++ constructors (and destructors) work and I use statically allocated C++ objects just fine. I also use C++ objects on the stack, because they can do convenient things like restore interrupt flags to the state they were in when the function entered and such.
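A stack object that restores the interrupt state is usually written as a small RAII guard. The sketch below is not jwatte's code; the interrupt mask is faked with a global bool so it runs on a desktop, and irqSaveAndDisable()/irqRestore() are hypothetical names. On a Cortex-M part they would read PRIMASK and execute `cpsid i`, then write PRIMASK back.

```cpp
#include <cassert>
#include <cstdint>

// Host stand-in for the CPU interrupt mask (assumption for this sketch).
static bool g_irq_enabled = true;

static bool irqSaveAndDisable() {
  bool was_enabled = g_irq_enabled;
  g_irq_enabled = false;
  return was_enabled;
}

static void irqRestore(bool was_enabled) { g_irq_enabled = was_enabled; }

// Disables interrupts for the lifetime of the object and puts the mask back
// to whatever it was on entry, so guards nest correctly.
class IrqGuard {
public:
  IrqGuard() : was_enabled_(irqSaveAndDisable()) {}
  ~IrqGuard() { irqRestore(was_enabled_); }
  IrqGuard(const IrqGuard&) = delete;
  IrqGuard& operator=(const IrqGuard&) = delete;

private:
  bool was_enabled_;
};

// Imagine an ISR also writes this counter.
static volatile uint32_t g_shared_ticks = 0;

uint32_t readTicks() {
  IrqGuard guard;         // interrupts masked from here...
  return g_shared_ticks;  // ...until the guard is destroyed on return
}
```

Because the destructor restores rather than unconditionally enables, the guard is safe to use in a function that may itself be called with interrupts already masked.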
I also use none of the dynamic allocation functions (no vector/list, no string, no runtime_error, etc) because malloc/new means you somehow don't know how to keep track of 64 kB of RAM, which seems weak :)
Now, what would be cool, would be if C++ destructors for classes that are only ever statically allocated were 100% elided by the linker -- they are effectively dead code!
My work-around is to just make 'em empty.
 
Oh -- the link is to yours, not Will's. Your text confused me :)
I see what you mean :D
Using a stack per FSM defeats the intended simplicity - when a preemptive scheduler isn't needed (which is most often the case).
Neither my library nor Warren's is preemptive, and both run in Thread mode rather than Handler mode (ISR), so there are no priorities involved. It's pretty simple.

jwatte said:
I have about a dozen devices that each run through states talking to multiple I2C, SPI, UART, and other devices. It's getting kind-of ludicrous really :)
Yep, set up a task for each driver. If you have a shared variable, I have a 'spin lock' that blocks the task's access to it and just spins until the spin lock frees the variable. This is based on the gcc builtin atomic memory access.

I use this with the Audio library, since my sketch code can have many tasks running in parallel, but because it runs at Thread level it doesn't interfere with any ISRs that the Audio library uses. One thing, though: do not call 'yield' in an ISR, since it will run the next task in Handler mode and block all Thread-level tasks. Even though there are examples of this, don't do it; I should delete those.
 
If you have a shared variable, I have a 'spin lock' that blocks the task's access to it and just spins until the spin lock frees the variable. This is based on the gcc builtin atomic memory access.

The beauty with the non-preemptive threads is that you don't need atomic memory functions. Memory is fully coherent between threads, because yielding only happens inside some library function.
If you share some state between an ISR and main code, you need some kind of exclusion, but atomic-memory-ops really isn't the best for that.

Once you share state with ISRs, you actually only need to disable interrupts (raise mask level to max) to know that you're mutually exclusive -- you still don't need the atomic operations.
It's not until you have multiple bus masters, or multiple cores/CPUs, that atomic memory operations actually are necessary.
BTW: You still need to disable/raise interrupt masks when dealing with spinlocks, because what if you grab a spinlock, and then some higher priority interrupt handler runs, and wants that spinlock, and will deadlock waiting on you to release it?

Also, the way that the blocking primitives are implemented in this library is somewhat basic -- wake up, check the thing, if it's not what you want, yield.
A slightly smarter scheduler would simply know what each task is waiting on, and only wake up a task if that thing changes. (Also known as: semaphores, or similar.) This, too, can be done without atomic memory operations, as long as you don't share state with other bus masters.
 
Years back, I wrote a scheduler called OPEX, in generic C. Not CPU specific. It has cooperative "threads", or FSMs, or whatever you wish to call them. Also known as run-to-completion, which is a fancy way to say FSM, where completion means finishing work (if any) for the current state.

It also has wait-for-flag bits, meaning the scheduler doesn't call the FSM until the desired flag bit(s) are true. The ISRs can set these bits, or other FSMs can do so. That avoids the small overhead of calling the FSM to let it check the flag and return or do other work. For persistent variables, OPEX calls the thread with a pointer to a block of memory. The called function can cast that pointer to a struct of vars or a struct of structs, etc. No heap used for reliability reasons. No C++ dynamic objects.

All this is single-stack, with no preemption and no need for mutual exclusion among FSMs. But mutual exclusion with the ISRs is always needed, scheduler or not. That's easy.
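To make the idea concrete, here is a minimal sketch of a single-stack, run-to-completion scheduler with flag waits and a per-task context pointer. It is not the OPEX API; every name below is invented for illustration, and two details are assumptions: a task runs only when all bits in its mask are set, and the scheduler clears those bits before calling it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef void (*TaskFn)(void* ctx);

struct Task {
  TaskFn fn;           // run-to-completion step function
  void* ctx;           // statically allocated per-task state
  uint32_t wait_mask;  // bits that must all be set; 0 means "always run"
};

// Set by ISRs or by other tasks.
static volatile uint32_t g_flags = 0;

// One pass over the task table; returns how many tasks were called.
size_t runOnce(Task* tasks, size_t n) {
  size_t ran = 0;
  for (size_t i = 0; i < n; ++i) {
    Task& t = tasks[i];
    if ((g_flags & t.wait_mask) != t.wait_mask) {
      continue;  // not ready: skip without calling into the task
    }
    // Consume the flags. On real hardware this read-modify-write needs
    // interrupts masked, since an ISR may be setting bits at the same time.
    g_flags = g_flags & ~t.wait_mask;
    t.fn(t.ctx);
    ++ran;
  }
  return ran;
}

// Example task: its persistent variables live in a struct reached via ctx.
struct BlinkState {
  uint32_t toggles;
};

static void blinkTask(void* ctx) {
  BlinkState* s = static_cast<BlinkState*>(ctx);
  ++s->toggles;
}
```

The main loop would just call runOnce() forever over a static table; a task that is waiting costs one mask comparison per pass instead of a function call.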

If anyone wants to revive OPEX and use for the Teensy, let me know. It is one of many such cooperative schedulers around. Far simpler than, say, FreeRTOS or ChibiOS for the majority of needs. Indeed, FreeRTOS has a config compile-time option for non-preemptive (cooperative) scheduling ... recommended for most newbies and is adequate for most apps.

The advantage of a cooperative scheduler and FSMs is that the code is clean, modular, easy to read and not a mess of if then else or one big switch statement with messy globals.
 