Yield and SerialEvent... and when internal functions should call yield?

Status
Not open for further replies.

KurtE

And now for another random idea ;)

Yesterday I was playing around with an SPI test case that uses a new member function: once I launch an async SPI request, I can call it to wait for the request to complete. Inside it I was testing a flag and, if the transfer was not done, calling yield()... And I was getting longer gaps of time than I would like, so I replaced the default yield in my sketch with:
Code:
void yield() {
}

So I was wondering: in cases like this, should internal APIs call yield()? Or should they just loop, or ...

Which made me wonder if it would make sense to maybe update the yield function... Right now on a Teensy 3.5/3.6 it calls available() on USB Serial and on each of the 6 possible UARTs, and for every port that has data waiting it then calls the matching serialEvent function.

So if your code does not actually use the serialEvent construct it may be doing a lot of calls...

What I am wondering is can I automatically short circuit it? One idea that I am thinking of doing is:

1) create a global bitmask variable: maybe something like: uint8_t g_serial_events_active = 0;

2) Then update all of the Serial class code, so that when the user calls Serial.begin() it sets bit 0 in the variable, when you call Serial1.begin() it sets bit 1...

3) In the yield code, change lines like: if (Serial.available()) serialEvent();
to something like: if ((g_serial_events_active & 0x1) && Serial.available()) serialEvent();

4) Change the default placeholder serialEvent implementations to turn off their g_serial_events_active bit
like: void serialEvent1() {g_serial_events_active &= ~0x2;}

5) at the start of yield check to see if there are any handlers. So yield might look like:
Code:
void yield(void)
{
	static uint8_t running=0;
	if (!g_serial_events_active || running) return; // TODO: does this need to be atomic?
	running = 1;
	if ((g_serial_events_active & 0x01) && Serial.available()) serialEvent();
	if ((g_serial_events_active & 0x02) && Serial1.available()) serialEvent1();
	if ((g_serial_events_active & 0x04) && Serial2.available()) serialEvent2();
	if ((g_serial_events_active & 0x08) && Serial3.available()) serialEvent3();
#ifdef HAS_KINETISK_UART3
	if ((g_serial_events_active & 0x10) && Serial4.available()) serialEvent4();
#endif
#ifdef HAS_KINETISK_UART4
	if ((g_serial_events_active & 0x20) && Serial5.available()) serialEvent5();
#endif
#if defined(HAS_KINETISK_UART5) || defined (HAS_KINETISK_LPUART0)
	if ((g_serial_events_active & 0x40) && Serial6.available()) serialEvent6();
#endif
	running = 0;
}

So in theory, if you don't use serialEvent, then once each serial port that you use has received its first character, the yield function will end up being just a quick test of one variable and a return.
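The five steps above can be tried out on a desktop compiler with a small stand-in. This is only a sketch under stated assumptions: FakePort, g_ports, portBegin, defaultEvent0, userEvent1 and yieldSketch are hypothetical names made up for illustration, and only g_serial_events_active and the overall flow come from the post. Port 0 keeps the self-disabling placeholder handler, port 1 plays the role of a sketch that really implements its event handler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a hardware serial port.
struct FakePort {
    int pending = 0;         // bytes waiting in the receive buffer
    int availableCalls = 0;  // how many times yieldSketch() had to poll this port
    int available() { ++availableCalls; return pending; }
};

constexpr int kPortCount = 2;
FakePort g_ports[kPortCount];
uint8_t g_serial_events_active = 0;  // step 1: bit n set => port n is still polled
int g_userEventCalls = 0;

// Step 2: begin() marks the port as a candidate for event dispatch.
void portBegin(int n) {
    assert(n >= 0 && n < kPortCount);
    g_serial_events_active |= static_cast<uint8_t>(1u << n);
}

// Step 4: the placeholder handler does no work and turns its own bit off.
void defaultEvent0() { g_serial_events_active &= static_cast<uint8_t>(~0x01u); }

// A "user supplied" handler for port 1: consumes the data and stays enabled.
void userEvent1() { ++g_userEventCalls; g_ports[1].pending = 0; }

// Steps 3 and 5: test the mask first, poll only ports whose bit is still set.
void yieldSketch() {
    static bool running = false;
    if (!g_serial_events_active || running) return;  // fast path
    running = true;
    if ((g_serial_events_active & 0x01) && g_ports[0].available()) defaultEvent0();
    if ((g_serial_events_active & 0x02) && g_ports[1].available()) userEvent1();
    running = false;
}
```

After the first byte arrives on port 0, the placeholder clears bit 0 and later yieldSketch() calls no longer poll that port at all, while port 1 keeps being serviced for as long as its handler leaves its bit set.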

Thoughts?
 
I did a quick try - not as elaborate - of making these conditional and it didn't help - probably quit too soon. I'm not sure how quick the Serial#.available() tests are - I got the idea they were FAST - but still a context switch and cache dumping waste of focus.

Not only does the device need to be 'in use' - but it only has value if serialEvent#() is actually in user code and this code is replaced:
void serialEvent5() __attribute__((weak));
void serialEvent5() {}
 
Thanks defragster,

The SerialX.available() calls are a virtual function call, so one call, plus internal to it, it calls a function like: serial_available so a second call. That function is pretty simple: it grabs a couple of values from memory, compares them, and does either a single subtraction or an addition and a subtraction...

So checking the flag should short circuit it.

As you mention calling serialEvent# is only useful if it does something. That is why I suggested having the default implementation, try to turn itself off...
 
The SerialX.available() calls are a virtual function call, so one call, plus internal to it, it calls a function like: serial_available so a second call.
No, it's not a virtual call:
Code:
yield:
    1e66:	f7ff fe2b 	bl	1ac0 <usb_seremu_available>
    1e6a:	b9a0      	cbnz	r0, 1e96 <yield+0x3e>
    1e6c:	f000 f96c 	bl	2148 <serial_available>
    1e70:	b9b8      	cbnz	r0, 1ea2 <yield+0x4a>
    1e72:	f000 f9ff 	bl	2274 <serial2_available>
    1e76:	b9d0      	cbnz	r0, 1eae <yield+0x56>
    1e78:	f000 fa92 	bl	23a0 <serial3_available>
    1e7c:	b9e8      	cbnz	r0, 1eba <yield+0x62>
    1e7e:	f000 fb0b 	bl	2498 <serial4_available>
    1e82:	bb00      	cbnz	r0, 1ec6 <yield+0x6e>
    1e84:	f000 fb84 	bl	2590 <serial5_available>
    1e88:	bb18      	cbnz	r0, 1ed2 <yield+0x7a>
    1e8a:	f000 fbfd 	bl	2688 <serial6_available>
    1e8e:	bb30      	cbnz	r0, 1ede <yield+0x86>

00002148 <serial_available>:
    2148:	4a05      	ldr	r2, [pc, #20]	; (2160 <serial_available+0x18>)
    214a:	4b06      	ldr	r3, [pc, #24]	; (2164 <serial_available+0x1c>)
    214c:	7810      	ldrb	r0, [r2, #0]
    214e:	781b      	ldrb	r3, [r3, #0]
    2150:	b2c0      	uxtb	r0, r0
    2152:	b2db      	uxtb	r3, r3
    2154:	4298      	cmp	r0, r3
    2156:	bf38      	it	cc
    2158:	3040      	addcc	r0, #64	; 0x40
    215a:	1ac0      	subs	r0, r0, r3
    215c:	4770      	bx	lr

Getting rid of the unused serial port code / buffers entirely would be good.
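For readers who do not read Thumb assembly, the serial_available listing above corresponds roughly to the C++ below. It is a reconstruction from the disassembly, not a copy of the core source: the variable and function names are hypothetical, the assignment of "head" and "tail" to the two loaded bytes is an assumption, and the buffer size of 64 is taken from the `#64` immediate in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; 64 matches the "addcc r0, #64" instruction.
constexpr uint32_t kRxBufferSize = 64;
volatile uint8_t rxHead = 0;  // index where the receive interrupt writes next
volatile uint8_t rxTail = 0;  // index where the reader consumes next

// Number of unread bytes in the ring buffer.
int bytesAvailable() {
    uint32_t head = rxHead;  // ldrb + uxtb: one byte load, zero extended
    uint32_t tail = rxTail;
    assert(head < kRxBufferSize && tail < kRxBufferSize);
    if (head >= tail) return static_cast<int>(head - tail);  // no wrap: one subtraction
    return static_cast<int>(kRxBufferSize + head - tail);    // wrapped: add, then subtract
}
```

So each poll costs two byte loads, a compare and one or two arithmetic instructions, plus the call and return.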
 
Thanks defragster,
...
As you mention calling serialEvent# is only useful if it does something. That is why I suggested having the default implementation, try to turn itself off...

I wasn't trying to restate the obvious - but to inspire someone with knowledge to say if the compile/link process could take this out when the 'weak' code isn't replaced.
 
No, it's not a virtual call:

Getting rid of the unused serial port code / buffers entirely would be good. yield() drags in all the serial port code, even if it is not used outside of yield(). (GCC doesn't eliminate dead code, if there are virtual functions.)

Thanks, maybe the optimizer got rid of the virtual part because the call is made on one specific object and not through a base-class pointer or reference... I was simply going off the class definition... Example:
Code:
class HardwareSerial2 : public HardwareSerial
{
public:
	virtual void begin(uint32_t baud) { serial2_begin(BAUD2DIV2(baud)); }
	virtual void begin(uint32_t baud, uint32_t format) {
					  serial2_begin(BAUD2DIV2(baud));
					  serial2_format(format); }
	virtual void end(void)		{ serial2_end(); }
	virtual void transmitterEnable(uint8_t pin) { serial2_set_transmit_pin(pin); }
	virtual void setRX(uint8_t pin) { serial2_set_rx(pin); }
	virtual void setTX(uint8_t pin, bool opendrain=false) { serial2_set_tx(pin, opendrain); }
	virtual bool attachRts(uint8_t pin) { return serial2_set_rts(pin); }
	virtual bool attachCts(uint8_t pin) { return serial2_set_cts(pin); }
	virtual int available(void)     { return serial2_available(); }
...
};
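A small illustration of why a function declared virtual can still show up as a plain `bl` in the disassembly. When the call is made on a named object, the compiler knows the exact type and is allowed to bind the call statically (and inline it); only calls through a base-class pointer or reference generally need the vtable. PortBase, Port2, port2 and fakeSerial2Available are hypothetical stand-ins for the classes quoted above, not the Teensy core code.

```cpp
// Plays the role of serial2_available() in the quoted class.
int fakeSerial2Available() { return 7; }

class PortBase {
public:
    virtual ~PortBase() = default;
    virtual int available() = 0;
};

class Port2 : public PortBase {
public:
    int available() override { return fakeSerial2Available(); }
};

Port2 port2;  // a named global object, in the same spirit as Serial2

// Exact type known at compile time: the compiler may call
// Port2::available() directly and inline it.
int pollDirect() { return port2.available(); }

// Type only known as "some PortBase": in general this goes through the vtable.
int pollViaBase(PortBase &p) { return p.available(); }
```

Both functions return the same value; the difference is only in the code the compiler is free to generate for each call site.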

I wasn't trying to restate the obvious - but to inspire someone with knowledge to say if the compile/link process could take this out when the 'weak' code isn't replaced.
That would be great, if it somehow could detect that weak code was not replaced and then remove a section of code...
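One related GCC/Clang technique, offered only as a sketch and not as what the Teensy core does: declare the handler weak but provide no default definition at all. On ELF targets the linker then resolves the missing symbol to address 0, so the caller can test for a user-supplied handler at run time with a single compare. userSerialEvent and dispatchUserSerialEvent are hypothetical names, and the behaviour relies on the compiler's weak attribute rather than on standard C++.

```cpp
// Declared weak and intentionally NOT defined anywhere in this file.
// If no other file defines it, an ELF linker resolves it to address 0
// instead of reporting an undefined symbol.
extern "C" void userSerialEvent() __attribute__((weak));

// Calls the handler only when some file actually defined it.
// Returns true if a handler was found and called.
bool dispatchUserSerialEvent() {
    if (userSerialEvent) {
        userSerialEvent();
        return true;
    }
    return false;
}
```

This does not remove code at link time, but it avoids calling an empty placeholder, and the check costs no more than testing a bit in a mask.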
 
This looks like 2 separate issues. The simpler one is my not-so-great structure for the hardware serial code. Apparently the weak symbols are only effective if referenced from the same source file, or "compile unit" in linker lingo. Years ago I didn't think of this when writing the code, and then later needing to reference it from yield and the fault handler. Someday it needs to be restructured to a single file, similar to how Arduino has theirs, so the weak symbol stuff actually works and the serial code doesn't get linked when it's not actually used. Especially on Teensy LC this would free up quite a lot of the limited memory.

The harder decision is what to do about yield() and event callbacks. I agree with using a scheme like that bitmask to minimize the time. I've long been considering adding more event callbacks (perhaps *many* more in the long run), which would almost certainly take too much time if they were all checked on every yield() call.
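As a rough illustration of how a bitmask keeps the cost low even with many callbacks, here is a minimal sketch of a pending-event mask: whatever detects an event sets one bit, and the dispatcher does a single test when nothing is pending. All names here (attachEvent, triggerEvent, dispatchPendingEvents, the globals and the demo callback) are hypothetical, and on real hardware the read-and-clear of the mask would need to be protected against interrupts, which this desktop sketch does not attempt.

```cpp
#include <cassert>
#include <cstdint>

using EventCallback = void (*)();

constexpr int kMaxEvents = 8;
EventCallback g_callbacks[kMaxEvents] = {};
volatile uint8_t g_pendingEvents = 0;  // bit n set => event n is waiting

void attachEvent(int id, EventCallback cb) {
    assert(id >= 0 && id < kMaxEvents);
    g_callbacks[id] = cb;
}

// Would typically be called from an interrupt handler.
void triggerEvent(int id) {
    assert(id >= 0 && id < kMaxEvents);
    g_pendingEvents = static_cast<uint8_t>(g_pendingEvents | (1u << id));
}

// What a yield()-style function could call: one test in the idle case.
void dispatchPendingEvents() {
    uint8_t pending = g_pendingEvents;
    if (!pending) return;  // nothing to do
    g_pendingEvents = 0;   // NOTE: not interrupt safe as written
    for (int id = 0; pending != 0; ++id, pending >>= 1) {
        if ((pending & 1) && g_callbacks[id]) g_callbacks[id]();
    }
}

// Demo callback used to observe dispatching.
int g_demoHits = 0;
void demoCallback() { ++g_demoHits; }
```

With this shape the idle cost stays constant no matter how many callbacks are attached; the loop only runs when at least one event is pending.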
 
This looks like 2 separate issues. The simpler one is my not-so-great structure for the hardware serial code. Apparently the weak symbols are only effective if referenced from the same source file, or "compile unit" in linker lingo. Years ago I didn't think of this when writing the code, and then later needing to reference it from yield and the fault handler. Someday it needs to be restructured to a single file, similar to how Arduino has theirs, so the weak symbol stuff actually works and the serial code doesn't get linked when it's not actually used. Especially on Teensy LC this would free up quite a lot of the limited memory.
With this change, the unused serial stuff doesn't get dragged in by yield (everything besides available() itself is eliminated):
https://github.com/PaulStoffregen/cores/pull/232

Teensy LC blink, Print / Stream without constexpr:

Sketch uses 12144 bytes (19%) of program storage space. Maximum is 63488 bytes.
Global variables use 2216 bytes (27%) of dynamic memory, leaving 5976 bytes for local variables. Maximum is 8192 bytes.


with https://github.com/PaulStoffregen/cores/pull/232:

Sketch uses 6396 bytes (10%) of program storage space. Maximum is 63488 bytes.
Global variables use 2132 bytes (26%) of dynamic memory, leaving 6060 bytes for local variables. Maximum is 8192 bytes.



Blink Teensy 3.6, Optimize faster, original:

Sketch uses 21404 bytes (2%) of program storage space. Maximum is 1048576 bytes.
Global variables use 4000 bytes (1%) of dynamic memory, leaving 258144 bytes for local variables. Maximum is 262144 bytes.

Constexpr Stream / Print

Sketch uses 9632 bytes (0%) of program storage space. Maximum is 1048576 bytes.
Global variables use 3872 bytes (1%) of dynamic memory, leaving 258272 bytes for local variables. Maximum is 262144 bytes.



It apparently allows GCC to get rid of all the virtual stuff.
 