Lightweight C++ callbacks

To use capturing lambdas, attachInterrupt would need a std::function-based API. Fortunately this is not difficult to achieve. Copy the attached files to your sketch folder and the following will work as intended:
Code:
#include "attachInterruptEx.h"

void setup()
{
    while(!Serial){}

    for (int pin = 2; pin <= 9; pin++)
    {
        pinMode(pin, INPUT_PULLUP);
        attachInterruptEx(pin, [pin] { Serial.println(pin); }, FALLING); // capture a copy of pin 
    }
}

void loop(){}

See https://github.com/luni64/TeensyHelpers for more examples
 

Attachments

  • attachInterruptEx.cpp (1.1 KB)
  • attachInterruptEx.h (128 bytes)

Sure, I don't propose to do this on the normal attachInterrupt. But in case someone needs it one can just use attachInterruptEx(). It peacefully coexists with the standard version.

Did you find time to look at the proposed callback system in #23?
 
Tried a quick measurement just now, and I must admit, the added overhead isn't nearly as much as I had expected.

Looks like attachInterrupt() is taking 165ns and attachInterruptEx() takes 174ns.

Measurement was from rising edge input to observed pin change by digitalToggleFast(), so probably includes a few instructions to set up registers.

 
I didn't expect much of a performance penalty, but only 6 cycles is impressive. That figure also includes the additional indirection introduced by attachInterruptEx, which adds a relay function that then calls the user callback. Those STL guys definitely know their trade.

Drawback: the required <functional> header is a bit expensive memory-wise (some 10 kB).
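
To illustrate the relay indirection mentioned above, here is a minimal sketch. It is not the code from the attached files: kSlots, isrTable, attachEx and fireAll are hypothetical names, and isrTable merely stands in for whatever table the hardware layer uses to hold plain void(*)() handlers. It is written as standard C++14 so it can be tried on a PC.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

constexpr std::size_t kSlots = 4;  // hypothetical number of interrupt slots

using isr_t = void (*)();          // what a plain attachInterrupt-style API accepts

std::array<isr_t, kSlots> isrTable{};                 // stand-in for the hardware ISR table
std::array<std::function<void()>, kSlots> callbacks;  // one user callback per slot

// One relay per slot: an ordinary function that forwards to the stored callback.
template <std::size_t N>
void relay() {
    if (callbacks[N]) callbacks[N]();
}

// Collect the addresses of relay<0>, relay<1>, ... in a table.
template <std::size_t... I>
std::array<isr_t, kSlots> makeRelays(std::index_sequence<I...>) {
    return {{&relay<I>...}};
}
const std::array<isr_t, kSlots> relays = makeRelays(std::make_index_sequence<kSlots>{});

// Hypothetical counterpart of attachInterruptEx(): store the callback, then
// register the matching relay where a plain function pointer is expected.
void attachEx(std::size_t slot, std::function<void()> cb) {
    assert(slot < kSlots);
    callbacks[slot] = std::move(cb);
    isrTable[slot] = relays[slot];
}

int firedSum = 0;  // written by the callbacks below

// Attach one capturing lambda per slot, "fire" every interrupt once, return the sum.
int fireAll() {
    firedSum = 0;
    for (std::size_t slot = 0; slot < kSlots; ++slot) {
        attachEx(slot, [slot] { firedSum += static_cast<int>(slot); });  // capture a copy of slot
    }
    for (isr_t isr : isrTable) {
        if (isr) isr();
    }
    return firedSum;
}
```

The extra cost per interrupt in this sketch is one call through the relay plus the std::function invocation, which would be consistent with an overhead of only a few cycles.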
 
Thanks for the analysis and the quick response to my "quick and dirty" question.
 
Thanks.

Given that I still use some LC's that is something to consider. I hope it's flash memory and not RAM?
 
It is both; best to give it a try. IIRC the LC uses nanolib instead of newlib, which brings down the memory requirements significantly.
 
@PaulStoffregen In case you are still interested: in the meantime I got capturing lambdas working as well, without using std::function or dynamic memory of course :). It looks like everything that is possible with std::function callbacks is also possible with my callback helper, and the syntax is the same. It is still experimental, but I'm confident that it could be made into a robust tool. Code and examples can be found here: https://github.com/luni64/cb

Here is the current user API (using the simple PIT class from #21 as a test implementation):
Code:
 // use free function callback
 timer.begin(onTimer, 250'000);

 // use lambda to attach member function as callback
 timer.begin([] { test.myFunc2(); }, 250'000);

 // use member function pointer and instance to attach callback. The syntax won't get nicer: "test.myFunc1" is not possible in C++ (maybe with some macro if really necessary)
 timer.begin(&Test::myFunc1, &test, 250'000);

 // attach non capturing lambda as callback
 timer.begin([] { Serial.println("called lambda"); }, 250'000);

 // attach capturing lambda as callback
 // this is especially useful for embedding the callback provider (e.g. IntervalTimer) in a user class without the static trick. 
 int n = 42;
 timer.begin([n] { Serial.println(n); }, 500'000 );

Quick explanation:
For each lambda the compiler basically generates an anonymous class with an operator(). The operator() executes the lambda code (the code between the braces). The captured variables (those between the brackets) end up as members of this class. Therefore, the size of the autogenerated class is essentially the size of the captured variables (plus any padding).
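
As an illustration of the explanation above, here is a hand-written class that behaves like the closure the compiler generates for [n] { return n + 1; }. The names Closure, viaLambda and viaFunctor are made up for this sketch; the real closure type is unnamed.

```cpp
#include <cassert>

// Hand-written equivalent of the closure type for: [n] { return n + 1; }
class Closure {
public:
    explicit Closure(int n) : n_(n) {}         // the capture becomes a constructor argument
    int operator()() const { return n_ + 1; }  // the lambda body becomes operator()
private:
    int n_;                                    // each captured variable becomes a member
};

int viaLambda(int n) {
    auto f = [n] { return n + 1; };
    // The closure has to hold its captured int, so it cannot be smaller than that.
    static_assert(sizeof(f) >= sizeof(int), "closure stores the captured int");
    return f();
}

int viaFunctor(int n) {
    Closure f(n);
    return f();
}
```

Both functions return the same value for the same input; the lambda is just a shorter way of writing the class.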

If you want to store the generated class (i.e. the lambda), e.g. for later use as a callback, you'd normally new it up on the heap, which doesn't require knowing its size beforehand (I assume this is why std::function uses dynamic memory). If we want to avoid dynamic memory allocation, we can statically preallocate a buffer and use `placement new` instead of `new` to construct the object in this buffer. Of course, the size of the preallocated buffer limits how many variables can be captured per callback. Usually you'd capture only a few numeric values or the 'this' pointer. I configured the code to reserve 16 bytes for captured variables per callback. This can be increased to any value if one is willing to accept the larger memory footprint. The good thing is that the code generates a compile-time error if the user captures too much.

The whole thing is encapsulated in a CallbackHelper class which handles all of the details. A library writer can use the high-level CallbackHelper API to generate callbacks from the various passed-in types (function pointers, lambdas etc.) without having to mess with the details. See the PIT class for an example.
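
The idea of a preallocated buffer plus placement new can be sketched roughly as follows. This is not the CallbackHelper code from the repository; StaticCallback, kCapacity and runStaticCallbackDemo are hypothetical names, and the 16-byte capacity is simply taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Stores one callable in a fixed 16-byte buffer, no heap involved.
class StaticCallback {
    static constexpr std::size_t kCapacity = 16;  // bytes reserved per callback
    alignas(std::max_align_t) unsigned char buffer_[kCapacity];
    void (*invoke_)(void*) = nullptr;   // knows how to call the stored object
    void (*destroy_)(void*) = nullptr;  // knows how to destroy the stored object

public:
    StaticCallback() = default;
    StaticCallback(const StaticCallback&) = delete;
    StaticCallback& operator=(const StaticCallback&) = delete;
    ~StaticCallback() { reset(); }

    template <typename F>
    void set(F f) {
        // Capturing too much is rejected at compile time.
        static_assert(sizeof(F) <= kCapacity, "callable does not fit into the buffer");
        static_assert(alignof(F) <= alignof(std::max_align_t), "callable is over-aligned");
        reset();
        new (buffer_) F(std::move(f));  // placement new: construct inside buffer_
        invoke_ = [](void* p) { (*static_cast<F*>(p))(); };
        destroy_ = [](void* p) { static_cast<F*>(p)->~F(); };
    }

    void reset() {
        if (destroy_) destroy_(buffer_);
        invoke_ = nullptr;
        destroy_ = nullptr;
    }

    void operator()() {
        if (invoke_) invoke_(buffer_);
    }
};

int runStaticCallbackDemo() {
    int n = 42;
    int out = 0;
    StaticCallback cb;
    cb.set([n, &out] { out = n; });  // an int and a reference fit into 16 bytes
    cb();
    return out;
}
```

Capturing, say, an array of 20 bytes by value would trip the first static_assert instead of silently overflowing the buffer.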

Let me know if you actually want to use it and need a robust version with destructors and error handling. Otherwise I'd call it a very nice learning project and stop here.
 
There is some news on this:
In the meantime I found some very interesting code in a (German) C++ forum, https://www.c-plusplus.net/forum/to...chere-eigene-variante-ersetzen-signal-slot/17, which shows how to implement a tiny drop-in replacement for std::function that doesn't use dynamic memory allocation. Like my CallbackHelper, it stores the passed-in objects (function pointers, lambdas, functors...) in a preallocated buffer. I defaulted the buffer size to 16 bytes, which is plenty, but it can be adjusted from user code if necessary.

I had to tweak the code a bit to make it compile as C++11, but now it is pretty generic and works with a lot of boards (XIAO, Nucleo, ESP32, SAMD...). However, it doesn't compile with the old GCC 5.4 (which lacks some utilities). So, to try it on a Teensy one currently needs to use the Teensyduino beta version. Here is the link to the library: https://github.com/luni64/staticFunctional (it is actually a single-header library, so just copy the file "staticFunctional.h" to your sketch folder and you are good to go).

The repo contains some examples and, just for the fun of it, a reworked version of IntervalTimer. Adapting IntervalTimer only required changing the declarations of the begin(...) functions from the explicit begin(void(*func)(), ....) to begin(callback_t func, ....), where callback_t is defined by "using callback_t = function<void(void)>;". That was basically all I needed to change. It shouldn't break any existing code since it still accepts void(*)() callbacks.

Here is a simple usage example for the library.
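
A rough sketch of why changing the begin(...) signature stays backward compatible. To keep it runnable on a PC, std::function is used here as a stand-in for the library's function<void(void)>; FakeTimer, tick and runTimerDemo are hypothetical and have nothing to do with the real IntervalTimer internals.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Stand-in for the library's callback type (an assumption made for this sketch).
using callback_t = std::function<void()>;

class FakeTimer {  // hypothetical, not the real IntervalTimer
public:
    // Previously: bool begin(void (*func)(), unsigned microseconds);
    bool begin(callback_t cb, unsigned microseconds) {
        if (!cb || microseconds == 0) return false;
        callback_ = std::move(cb);
        period_ = microseconds;
        return true;
    }
    void tick() {  // would be called from the timer ISR
        if (callback_) callback_();
    }

private:
    callback_t callback_;
    unsigned period_ = 0;
};

int ticks = 0;
void onTimer() { ++ticks; }

int runTimerDemo() {
    ticks = 0;
    FakeTimer t1, t2;
    t1.begin(onTimer, 250000);                    // existing code: a plain void(*)() still works
    int step = 10;
    t2.begin([step] { ticks += step; }, 500000);  // new: a capturing lambda
    t1.tick();
    t2.tick();
    return ticks;  // 1 from onTimer plus 10 from the lambda
}
```

Because a plain function pointer converts implicitly to the callback type, existing sketches that pass void(*)() keep compiling unchanged.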
Code:
#include "staticFunctional.h"
using namespace staticFunctional; // save typing

class MyClass
{
 public:
    void nonStaticMemberFunction() {
        Serial.printf("nonStaticMemberFunction i=%d\n", i);
    }

    void operator()() const {
        Serial.printf("functor 2i=%d\n", 2*i);
    }
    int i = 42;
};
MyClass myClass;

void freeFunc(){
    Serial.println("free function");
}

void setup()
{
    while (!Serial) {}

    function<void(void)> f; // function taking no arguments and returning nothing

    f = freeFunc;
    f();

    f = [] { Serial.println("lambda expression"); };
    f();

    f = [] { myClass.nonStaticMemberFunction(); }; // non static member function
    f();

    f = myClass; // functor, f uses operator ()()
    f();
}

void loop(){
}

// prints:
// free function
// lambda expression
// nonStaticMemberFunction i=42
// functor 2i=84
 
This is great stuff!

I cannot understand why PaulS was engaged enough to start this whole thing up, and to jump in to explain loop capture in response to a side comment, yet apparently never looked at serious potential solutions like this, the earlier POC, and the well-established etlcpp, or at least didn't bother commenting on any of them.
 
Discussed on other threads, the rough plan is to release 1.58 with the gcc 5.4.1 -> 11.3.1 update, then in 1.59 development switch to C++17 dialect. Publishing both a new toolchain and C++ dialect change for the same stable release is considered too risky. This special callback class stuff will move forward when we're on C++17.
 
Now that we have 1.59 beta with updated toolchain and C++17, I'm looking at staticFunctional.

Luni, if you're still watching this thread, can you help me understand the memory usage? I see a pair of _emplace() functions which definitely use C++ new.

https://github.com/luni64/staticFunctional/blob/main/src/staticFunctional.h#L158

https://github.com/luni64/staticFunctional/blob/main/src/staticFunctional.h#L170

The move() and copy() functions inside InvokerBase also use C++ new. All the constructors that actually do something seem to call _emplace(), copy(), or move().

Your readme says no dynamic memory allocation. How is that? I just don't understand. Did I miss something?
 
This is using "placement new", which doesn't allocate memory but constructs the object in the buffer passed in (the parameter after the new keyword). That buffer is statically allocated.
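
A small sketch of what placement new does, independent of staticFunctional: Point, storage and constructedInBuffer are made-up names for illustration only.

```cpp
#include <cassert>
#include <new>

struct Point {
    int x;
    int y;
};

// Statically allocated, suitably aligned raw storage for one Point.
alignas(Point) static unsigned char storage[sizeof(Point)];

bool constructedInBuffer() {
    // Placement new: no allocation, the object is constructed at the given address.
    Point* p = new (storage) Point{3, 4};
    bool sameAddress = static_cast<void*>(p) == static_cast<void*>(storage);
    bool ok = sameAddress && p->x == 3 && p->y == 4;
    p->~Point();  // objects created this way are destroyed explicitly, never deleted
    return ok;
}
```

The pointer returned by the new expression is simply the address of the buffer, so the heap is never touched.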
https://stackoverflow.com/questions/222557/what-uses-are-there-for-placement-new

Anyway, I now prefer inplace_function since it comes from a C++ standardization study group (SG14). See https://github.com/PaulStoffregen/cores/pull/656#issuecomment-1252907659
I also use it in the TimerTool and the EncoderTool.

EDIT: https://github.com/WG21-SG14/SG14/blob/master/SG14/inplace_function.h
 
Here's what I'm doing in my programs (and in QNEthernet) to play around with different function types. Of course, I built this on top of @luni's very hard work. I simplified it just a little bit and did what I felt was something slightly more generic.

1. Instead of including <functional>, I include "function_t.h".

The contents of function_t.h:

Code:
#ifdef MYPROGRAM_USE_INPLACE_FUNCTION

#include "sg14/inplace_function.h"
template <class Signature>
using function_t = stdext::inplace_function<Signature>;

#else

#include <functional>
template <typename Signature>
using function_t = std::function<Signature>;

#endif  // MYPROGRAM_USE_INPLACE_FUNCTION

2. Just after the includes and before the namespace declaration in inplace_function.h, I add this:

Code:
#ifndef __cpp_exceptions

// Crash idea from smalloc_do_crash() in sm_util.c.
// Loop forever from startup.c.
[[noreturn]] static void crash() {
  char *c = nullptr;
  *c = 'X';
  while (1) asm("WFI");  // Is this really necessary if there's been a crash?
                         // Does the compiler prefer this line here?
                         // Would just a `while (1) {}` be more appropriate?
}

#define SG14_INPLACE_FUNCTION_THROW(x) crash()

#endif  // !__cpp_exceptions

3. Replace all `std::function` tokens with `function_t`. (It might be a good idea to put it in its own namespace.)

This saves a little over 38.5k in the flash and a little over 4.5k in RAM1. (For my specific test program.)

I'm sure there are more improvements that could be made; this is just my initial approach. (On top of @luni's large conceptual shoulders.)
 
Oh great, that's more or less the same thing I do in the TimerTool and the EncoderTool. I added some stuff to make it compile with older toolchains, but that should not be necessary here. I'll try your crash function. Looks better than mine :)
Do you see a chance to make it work without adopting the inplace_function.h header? I tried but didn't succeed.

This saves a little over 38.5k in the flash and a little over 4.5k in RAM1. (For my specific test program.)
I assume this is comparing <functional> to <inplace_function> right? This is about the same saving I saw. BTW: If you switch from newlib (the c-lib used for T3.x and T4.x by default) to nanolib (default for T-LC), <functional> also shrinks by about this amount.
 
Would be good to hook 'static void crash()' to PJRC's crashreporting for user feedback on restart?

"WFI" on T_4.x can sleep the CPU when there are no active interrupts - the millis timer doesn't count. Not sure if that precludes USB control?

If tied to CrashReport then normal 8 second wait/restart with 'status' could print - but report info would need to be hand crafted given it isn't a 'fault' but reported 'exception'.
 
Do you see a chance to make it work without adopting the inplace_function.h header? I tried but didn't succeed.

Can you explain what you mean? I don't understand the question. :)

I assume this is comparing <functional> to <inplace_function> right? This is about the same saving I saw.

Yes. (But with "inplace_function.h".)
 
There might be a better way to work with exceptions and the CrashReport. I was experimenting with calling an uninitialized `std::function<void()>`, and the terminal printed this:

Code:
terminate called after throwing an instance of 'std::bad_function_call'
terminate called recursively

The code:

Code:
#include <functional>

void setup() {
  Serial.begin(115200);
  delay(2000);

  std::function<void()> f;
  f();
}

void loop() {
}

I poked around and found, in <bits/std_function.h>, that it calls std::__throw_bad_function_call() when calling an uninitialized function object. There's a bunch of non-exception equivalents for exceptions (and probably in other places too). So I replaced `crash()` with this:

Code:
#define SG14_INPLACE_FUNCTION_THROW(x) std::__throw_bad_function_call()

Maybe if there's a way to hook into how all these "__throw_XXX()" functions call some internal terminate(), there'd be a nicer way to integrate with CrashReport. I haven't looked too deeply; just thinking aloud. Also, this might be just for GCC.
 
Interesting, those __throw_XXX functions used to never work correctly. See e.g. https://forum.pjrc.com/threads/5719...-STL-libraries?p=224383&viewfull=1#post224383. Maybe your work on supporting the standard printf function made them work now?

Does inplace_function "throw" if the function object is too large to be stored in its statically allocated memory (32 bytes by default)? You should be able to provoke this by storing a lambda whose captured variables need more than 32 bytes of storage.

(std::function statically allocates a very small storage (4 bytes?) for this. If the stored object requires more, it dynamically allocates the required memory)
 
Just gave it a try: inplace_function generates a compile error for a function object that is too large, which is even better than "throwing" at runtime.
 
Interesting, those __throw_XXX functions used to never work correctly. See e.g. https://forum.pjrc.com/threads/5719...-STL-libraries?p=224383&viewfull=1#post224383. Maybe your work on supporting the standard printf function made them work now?

Maybe it's the new toolchain. I had forgotten about that post... :) Maybe that's why it works now, because I didn't know (or, didn't remember) that it didn't work. :p

Does inplace_function "throw" if the function object is too large to be stored in its statically allocated memory (32 bytes by default)? You should be able to provoke this by storing a lambda whose captured variables need more than 32 bytes of storage.

I get a compile error if the parameters are too large. Example code:

Code:
#include "inplace_function.h"

struct big {
  uint8_t b[33]{0};
};

void setup() {
  Serial.begin(115200);
  delay(2000);

  big b;
  stdext::inplace_function<void()> f = [b]{  // Copied capture
    printf("%u\r\n", b.b[0]);
  };
  f();
}

void loop() {
}

The error:

Code:
.../TestNullFunction/inplace_function.h: In instantiation of 'stdext::inplace_function<R(Args ...), Capacity, Alignment>::inplace_function(T&&) [with T = setup()::<lambda()>; C = setup()::<lambda()>; <template-parameter-2-3> = void; R = void; Args = {}; unsigned int Capacity = 32; unsigned int Alignment = 8]':
.../TestNullFunction/TestNullFunction.ino:26:3:   required from here
.../TestNullFunction/inplace_function.h:241:33: error: static assertion failed: inplace_function cannot be constructed from object with this (large) size
  241 |         static_assert(sizeof(C) <= Capacity,
      |                       ~~~~~~~~~~^~~~~~~~~~~
.../TestNullFunction/inplace_function.h:241:33: note: '(sizeof (setup()::<lambda()>) <= 32)' evaluates to false

Note: The line numbers won't necessarily match what you have.

Changing the declaration to this fixes it:

Code:
  stdext::inplace_function<void(), sizeof(big)> f = [b]{

This discussion gives some more clarity:
https://github.com/WG21-SG14/SG14/issues/160#issuecomment-883529379
 
The more I play with it the more I like it :)

However, the unexpected writing of std::function (and probably other STL facilities as well) to the serial port in case of an error is kind of scary. One never knows what is connected to this port... So, I'd rather stick with the provoked crash for stdext::inplace_function.
 
I think that's just how `std::terminate()` works. Here's the output if I call `std::terminate()` (from <exception>) myself:

Code:
terminate called without an active exception

I'm not certain, but I have a feeling there's a way to override that.
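
One standard way to override it is std::set_terminate() from <exception>, which replaces the handler that std::terminate() calls. The sketch below only shows the mechanism; quietTerminate and installQuietTerminate are hypothetical names, and whether this actually suppresses the serial output on a Teensy is an assumption I have not verified. On a microcontroller the handler might loop forever or trigger a reset instead of calling std::abort().

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>

// Hypothetical replacement handler: prints nothing and must not return.
[[noreturn]] void quietTerminate() {
    std::abort();
}

// Installs the handler and reports whether it is now the active one.
bool installQuietTerminate() {
    std::set_terminate(quietTerminate);
    return std::get_terminate() == quietTerminate;
}
```

Calling installQuietTerminate() early, e.g. at the start of setup(), would make any later std::terminate() go through the quiet handler.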
 