Teensy 4.1 Ethernet + SD -- firmware design

karelv

Member
The goal is to set up state-of-the-art firmware on a Teensy 4.1 that serves both Ethernet and an SD card.
I found there is NativeEthernet, but there is also QNEthernet.

So my first question is:
What criteria should I use to make my selection between those two libraries?
Note: I want the firmware to keep running without the MQTT broker, even without an Ethernet cable.
Note2: When the broker (or Ethernet) comes back online, the Teensy must reconnect.
Note3: Preferably without using the brownout reset function.

Then I want a fast response from the Teensy 4.1 to the MQTT messages that are sent to it (PubSubClient).
However, at the same time (well, just in parallel) I want to write to the SD card.
There might be no SD card present, and it can take a while to discover that, meaning the Teensy is blocked in the SD routines for longer than I want for the MQTT response.

So my second question is:
How do I cope with that?
Do I spin up a new thread for the network routines, and keep the SD card in the main loop?
Do I spin up a new thread for the SD routines, and keep the network in the main loop?
Or are there options other than threads?

Note: I have a 10ms timer interval ISR routine in parallel to this, but I think it is not relevant (it's a very short service routine).

Thanks,
Karel.
 
karelv said:
"The goal is to set up state-of-the-art firmware on a Teensy 4.1 that serves both Ethernet and an SD card. I found there is NativeEthernet, but there is also QNEthernet. So my first question is: What criteria should I use to make my selection between those two libraries?"

Use QNEthernet. NativeEthernet is no longer updated or supported.

karelv said:
"Then I want a fast response from the Teensy 4.1 to the MQTT messages that are sent to it (PubSubClient). However, at the same time (well, just in parallel) I want to write to the SD card. There might be no SD card present, and it can take a while to discover that, meaning the Teensy is blocked in the SD routines for longer than I want for the MQTT response. So my second question is: How do I cope with that?"

Start experimenting!
 
Thanks @joepasquariello, this is useful, QNEthernet it is!
Of course, I will experiment and report back to the forum on what happens.
 
A few notes about the QNEthernet library that may assist you:
1. It doesn’t use timers or interrupts to handle data or events. It’s designed to be used with a single-threaded approach.
2. You can add listener functions that are called when various events such as link-down occur.
3. It’s possible to use the library in a non-blocking manner.
 
Thanks @Shawn!
Given these inputs, I'll start my experiment with:
- Use QNEthernet lib.
- Do ethernet stuff in the main loop.
- Put SD stuff in a thread.
- Have my 10ms timer do little things on I2C.
- Guard my shared resources with a mutex.
- Use cppQueue/mutex for communication between interrupt, SD-thread and main-loop.
Let's see where this brings me!
 
This discussion (in the QNEthernet repo) may be of use:
Highly random and slow data output with large transfers over TCP (using TeensyThreads) #39

I find that using threads can slow things down. You can accomplish a lot with an event loop that loops through the program components, checking each one for “ready”, and if so, “do some stuff”. Being “ready” can include things like “has 10ms elapsed yet?” sorts of statements. Maybe threads are the way to go for super-precise timing, but the various timer classes can accomplish the same things without threads.
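
As a rough illustration of the "is it ready yet?" loop described above, here is a minimal sketch. The names PeriodicTask and runEventLoopOnce are made up for this example and come from no library. The current time is passed in as a parameter so the code also compiles off-target; on a Teensy you would pass millis(), or use the core's elapsedMillis type for the same kind of check.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: runs a callback whenever `periodMs` has elapsed.
// Time is injected by the caller; on a Teensy that would be millis().
class PeriodicTask {
 public:
  PeriodicTask(uint32_t periodMs, std::function<void()> work)
      : periodMs_(periodMs), work_(std::move(work)) {
    assert(periodMs_ > 0);
  }

  // "Ready" check: returns true if the task ran on this pass.
  bool poll(uint32_t nowMs) {
    // Unsigned subtraction stays correct when the millisecond counter wraps.
    if (static_cast<uint32_t>(nowMs - lastRunMs_) < periodMs_) {
      return false;
    }
    lastRunMs_ = nowMs;
    work_();
    return true;
  }

 private:
  uint32_t periodMs_;
  uint32_t lastRunMs_ = 0;
  std::function<void()> work_;
};

// One pass of the event loop: give every component a chance to run.
inline void runEventLoopOnce(std::vector<PeriodicTask>& tasks, uint32_t nowMs) {
  for (auto& task : tasks) {
    task.poll(nowMs);
  }
}
```

In a sketch, loop() would call runEventLoopOnce(tasks, millis()) and nothing else. The approach only stays responsive if every callback returns quickly, which is the same constraint the rest of this thread keeps running into.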

Another useful link from the forum:
QNEthernet and TeensyThreads causes assertions/crashes. ...solved!
 
Thanks Shawn.
My first experiment is not going very well... It turns out one cannot use a mutex in an ISR. I also tried handshaking with volatile variables, but it ends in a deadlock. So I have no good alternative to a mutex for locking resources inside the ISR. I saw here that there is a FreeRTOS option, but that sounds like overkill.

I was really drawn to using a timer ISR, as I read that those ISRs keep running even when the main program blocks, and thus I could keep my primary function running!
Now I guess a timer ISR is more for something that requires high timing accuracy, where you can (and have to) put in the necessary attention to get it right.
But for my use case, I believe it is overkill.

So now I'm going with Shawn's advice, to see if I can make my program event-loop driven. However, some libraries do not give up execution voluntarily, so for those I still might want to add a thread and use a mutex for resource locking.


Shawn you stated:
the various timer classes can accomplish the same things
Can you (or anybody) give an example of a good timer class?
 
Worth reading this: https://dl.acm.org/doi/pdf/10.1145/37499.37509 - section on semaphores page 86.

My favorite synchronization primitive is the EventCounter, a.k.a. EventCount, which allows producer/consumer relationships to be synchronized without any mutual exclusion (so it works with interrupts). You need one counter for the producer, one for the consumer. It works well integrated into a circular buffer implementation as you simply take the counter values modulo the buffer size to index it, and with power-of-two size buffer you can easily handle wrap-around and not need a long long.
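
To make the two-counter idea concrete, here is a minimal sketch of a single-producer/single-consumer ring buffer built that way. The class name EventCountRing and its interface are invented for this example, and the `start` constructor argument exists only so the wrap-around case can be exercised. It assumes exactly one producer (for example an ISR) and one consumer (for example the main loop), and that 32-bit atomics are lock-free on the target.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Single-producer/single-consumer ring buffer driven by two event counters.
// Only the producer writes `produced_`; only the consumer writes `consumed_`,
// so neither side ever needs a mutex.
template <typename T, size_t N>
class EventCountRing {
  static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");

 public:
  // `start` sets both counters; useful for testing near the wrap point.
  explicit EventCountRing(uint32_t start = 0)
      : produced_(start), consumed_(start) {}

  // Producer side. Returns false (item dropped) when the buffer is full.
  bool push(const T& item) {
    const uint32_t p = produced_.load(std::memory_order_relaxed);
    const uint32_t c = consumed_.load(std::memory_order_acquire);
    const uint32_t used = p - c;  // valid across counter wrap-around
    assert(used <= N);
    if (used == N) {
      return false;
    }
    slots_[p & (N - 1)] = item;  // counter modulo size gives the index
    produced_.store(p + 1, std::memory_order_release);
    return true;
  }

  // Consumer side. Returns false when nothing is pending.
  bool pop(T& out) {
    const uint32_t c = consumed_.load(std::memory_order_relaxed);
    const uint32_t p = produced_.load(std::memory_order_acquire);
    if (p == c) {
      return false;
    }
    out = slots_[c & (N - 1)];
    consumed_.store(c + 1, std::memory_order_release);
    return true;
  }

  // Number of items waiting to be consumed.
  uint32_t pending() const {
    return produced_.load(std::memory_order_acquire) -
           consumed_.load(std::memory_order_acquire);
  }

 private:
  T slots_[N] = {};
  std::atomic<uint32_t> produced_;
  std::atomic<uint32_t> consumed_;
};
```

Because N is a power of two, the index stays continuous when the 32-bit counters wrap, so plain unsigned arithmetic is enough and no wider counter type is needed.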
 
Thanks @MarkT, I guess you meant page 96.
This might come in handy for another project some time in the future. For now, I'm looking into the 'event-loop' concept, and thus keeping it as light on ISRs and threads as possible.
 
I use ISRs by setting a volatile flag when the interrupt triggers, and then checking that flag in the event loop. Fortunately, in my use cases, the time between flag set and flag check hasn’t been critical. For example, QNEthernet does this in enet_isr() in lwip_t41.c. (In my case, because of how atomic flags work, I clear the flag when the interrupt happens and then set it again upon check.)
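
Here is one way the inverted-flag variant could look. This is a sketch under assumptions, not QNEthernet's actual code, and the class and method names are invented. It reads the description above as follows: std::atomic_flag can (before C++20) only be read through test_and_set(), so the flag is kept set while idle, the ISR clears it, and the event loop's test_and_set() both detects the event and re-arms the flag.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper around the "clear in ISR, set again on check" pattern.
// The flag is SET while idle and CLEAR while an event is pending.
class IsrEventFlag {
 public:
  IsrEventFlag() { idle_.test_and_set(); }  // start in the "no event" state

  // Call from the interrupt handler: mark that something happened.
  void signalFromIsr() { idle_.clear(std::memory_order_release); }

  // Call from the event loop: returns true once per signalled event and
  // re-arms the flag in the same atomic step.
  bool consume() { return !idle_.test_and_set(std::memory_order_acquire); }

 private:
  std::atomic_flag idle_ = ATOMIC_FLAG_INIT;
};
```

Several interrupts arriving between two checks collapse into a single event, so this suits "something needs servicing" notifications rather than counting. For counting, a producer/consumer counter pair is the better fit.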
 
Shawn, I do something similar, but instead of a loop I use cooperative multitasking. This lets me use the same structure as for a preemptive OS, but avoid the challenges of having task switches occur anywhere within a task. I have a simple set of synchronization and communication methods (mailboxes, queues, etc.) and a timeout mechanism. Any time a task has to wait for something, it calls yield(), which is overridden to be a task switch. As long as no task runs too long without yielding, everything gets done soon enough.
 
The two multitasking approaches are equivalent, differing only in where the switching logic is and what the API looks like. In both cases, tasks run until they decide they’re complete (including checking timeouts for timers) and then the next task runs. Yours probably looks more like a threading API, though. :) I’d say performance is also equivalent, as are the capabilities. (@joepasquariello you probably know all this, I’m just outlining for others. :))

Most software design, in my mind, falls into two approaches: “pull” and “push”. I’d call the event loop approach a “pull” approach, and the yield() override approach a “push” approach. (At least, this is my own mental model.) Remember XML “push” vs. “pull” parsing? :)
 
Shawn, is each task in your method a state machine? I remember doing things that way for Palm.

It frequently is, but doesn't have to be, especially when the task is a simple "is there something to process? if not, then pass control along to the next."

I'll add: It leads to a very bespoke approach to the problem, and the "yield()-overriding push (if you'll allow me that term)" approach leads to a more generic API much sooner.
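
For readers wondering what a task written as a state machine might look like in the context of this thread, here is a small sketch of an SD logging task. Everything in it is hypothetical: the Storage hooks stand in for whatever SD-library calls you actually use, and LoggerTask is not taken from any of the code discussed here. Each call to step() does one small piece of work and returns, so the loop can get back to the network quickly.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical storage hooks. On a Teensy these might wrap SD-library calls;
// here they are plain callables so the sketch runs anywhere.
struct Storage {
  std::function<bool()> cardPresent;
  std::function<bool(const std::string&)> appendLine;
};

// A task written as a state machine: every step() does one small piece of
// work and returns, so the event loop can move on to the next component.
class LoggerTask {
 public:
  enum class State { WaitForCard, Idle, Writing };

  explicit LoggerTask(Storage storage) : storage_(std::move(storage)) {
    assert(storage_.cardPresent && storage_.appendLine);
  }

  void enqueue(std::string line) { pending_.push_back(std::move(line)); }
  State state() const { return state_; }

  void step() {
    switch (state_) {
      case State::WaitForCard:
        if (storage_.cardPresent()) state_ = State::Idle;
        break;
      case State::Idle:
        if (!storage_.cardPresent()) {
          state_ = State::WaitForCard;
        } else if (!pending_.empty()) {
          state_ = State::Writing;
        }
        break;
      case State::Writing:
        // Write a single line per step; on failure, go back to waiting
        // and keep the line queued for a later retry.
        if (storage_.appendLine(pending_.front())) {
          pending_.pop_front();
          state_ = State::Idle;
        } else {
          state_ = State::WaitForCard;
        }
        break;
    }
  }

 private:
  Storage storage_;
  std::deque<std::string> pending_;
  State state_ = State::WaitForCard;
};
```

The sketch only helps if the hooks themselves return quickly. A presence check that blocks for seconds would still stall the loop, whatever the state machine does around it.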
 
Do you actually override yield() or use the EventResponder mechanism?
 
Yes, I override yield() to be a cooperative task switch. The code below is from Arduino core file hooks.c, and it seems pretty explicit that this is how yield() was intended to be used. EventResponder may have some advantages, but I find this model much easier to understand.

Code:
/**
 * Empty yield() hook.
 *
 * This function is intended to be used by library writers to build
 * libraries or sketches that supports cooperative threads.
 *
 * Its defined as a weak symbol and it can be redefined to implement a
 * real cooperative scheduler.
 */
static void __empty() {
    // Empty
}
void yield(void) __attribute__ ((weak, alias("__empty")));

The attached file contains my cooperative kernel and a test sketch. It works on Teensy LC/3.x/4.x/MM and also on SAMD51 boards from Adafruit. If you run the example on T4.x, you'll find it can do somewhere between 14M and 17M task switches per second. I have an "OS" that layers on top of the kernel, with a simple wrapper for creating tasks, a periodic tick mechanism for timeouts, and a set of mailbox, queue, and semaphore functions for communication and synchronization. This lets you create periodic tasks, wait for data or signals from ISRs, etc. I'm pretty close to having the OS layer cleaned up enough to share. It's very "C-like", because I still use the original version on 683xx projects where I have a very old (30+ years) compiler.
 

Attachment: FsKernel0.zip
 
Small update: I was traveling and had other priorities, but the project is progressing well (slowly but surely).

I faced a couple of issues:
- When the mosquitto service (MQTT) goes down (while the server is up), the Teensy tries to reconnect, but that takes 1 second, even when it does not succeed. I found that I can use client.setConnectionTimeout to reduce that to ~5ms. But then when it actually connects (the mosquitto service comes up again), it blocks for 370ms. This is currently not solved, but with this PR: https://github.com/knolleary/pubsubclient/pull/567/files and by implementing the yield function to give 'ticks' to the 'arduino-timer' instances, I think it could be solved. So my question to @joepasquariello and others: is it acceptable to implement your own 'yield' function in the 'ino' file?
- I found a similar issue when the SD card is not present: it blocks everything for 3 seconds. I'm not sure whether the SD lib uses the 'yield' function.
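
One common way to limit the cost of a blocking connect attempt is to rate-limit the attempts from the loop. The sketch below shows that idea only. ReconnectPolicy is an invented name, and `tryConnect` is an injected stand-in for whatever call opens the broker connection (with PubSubClient that would be a connect() call), so the code runs without hardware.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical reconnect helper: at most one connection attempt per
// `retryMs`, and none at all while the Ethernet link is down.
class ReconnectPolicy {
 public:
  ReconnectPolicy(uint32_t retryMs, std::function<bool()> tryConnect)
      : retryMs_(retryMs), tryConnect_(std::move(tryConnect)) {
    assert(tryConnect_);
  }

  // Call once per loop pass with the current time in milliseconds.
  void poll(uint32_t nowMs, bool linkUp) {
    if (connected_ || !linkUp) return;
    if (attempted_ &&
        static_cast<uint32_t>(nowMs - lastAttemptMs_) < retryMs_) {
      return;  // too soon since the previous attempt
    }
    attempted_ = true;
    lastAttemptMs_ = nowMs;
    connected_ = tryConnect_();
  }

  // Call when the application notices the connection has dropped.
  void markDisconnected() { connected_ = false; }
  bool connected() const { return connected_; }

 private:
  uint32_t retryMs_;
  std::function<bool()> tryConnect_;
  uint32_t lastAttemptMs_ = 0;
  bool attempted_ = false;
  bool connected_ = false;
};
```

This does not make a single attempt any shorter, so the 370ms stall on a successful connect would remain. It only bounds how often the stall can happen while the broker is unreachable.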

Other than that, it goes well....
 
- The EventResponder sounds interesting, where do I find more information? Shawn?
- I don't feel comfortable implementing my own 'yield' function, as there is a risk of an infinite loop, and thus a stack overflow...
- I found the SD.mediaPresent() function, so my second issue (3 sec) is solved.
 
yield() is weak and easy to replace in a sketch.
It is more efficient these days. It mainly polls any serialEvent() functions that are present and then returns. It doesn't actually delay() or waste much time before returning.

If no serialEvent() code is used - and not using Event Responder - then replacing with an empty one is easy and might show it isn't related:
void yield() {} // in sketch replaces the Arduino standard for checking serialEvent()'s
 
Thanks @defragster, I didn't know about serialEvent; as T4.1 has 8 UARTs, does it mean we have serialEvent1..8() functions?
How do I add (my custom) EventResponder to my sketch?
 
Interesting @defragster, I'm more interested in the EventResponder functionality.
I dug through some forum threads and tried to make sense of the information I could find on EventResponder.
The outcome is a minimalistic example or demonstrator which I posted here: (my way of giving back to this great community!)
It is also how I intend to use it in my project, thus avoiding re-implementing the 'yield' function in my own project code.
 