I told you about this new information a long time ago :) But anyway, I'm glad that a newer toolchain is finally being used. It would be nice to see this for decades-old components like CMSIS or the libraries, too.
@ JarkkoL
This piece of silicon, quite impressive for a microcontroller, is nevertheless miles away from a modern desktop CPU. It also helps to simply compare the number of transistors, which is orders of magnitude lower, or...
Even with branch prediction, a register must be incremented or decremented and a comparison/jump must be performed. This is not without cost. (The jump takes 1 cycle if the prediction is correct. +1 cycle minimum for...
Well, this is a forum for Teensy, and I would guess that it would run on a T4. (btw, the architecture is also known ;-), and it has no "no-overhead loops" like the ESP32)
Frankzappa, don't get confused! Normally the...
It will not be _that_ much faster, because the number of loop iterations matters, too. An unrolled loop has far fewer jumps and compares. Jumps are a bit expensive, too (though not as expensive as a division).
But both together is ok :)
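To make the unrolling point concrete, here is a rough sketch of a loop unrolled by hand four times. The function name `scaleUnrolled4` and its parameters are made up for illustration and are not from the code discussed in this thread. Also keep in mind that the compiler may already unroll a simple loop like this by itself at higher optimization levels, so measure before committing to it.

```cpp
#include <cstddef>

// Hypothetical helper (not from the original code): scale every sample
// in place, processing four samples per loop iteration.
void scaleUnrolled4(float* data, std::size_t n, float factor) {
    std::size_t i = 0;
    // Main body: one counter update and one compare/branch per four samples.
    for (; i + 4 <= n; i += 4) {
        data[i]     *= factor;
        data[i + 1] *= factor;
        data[i + 2] *= factor;
        data[i + 3] *= factor;
    }
    // Remainder: handle the last 0..3 samples one at a time.
    for (; i < n; ++i) {
        data[i] *= factor;
    }
}
```

The second loop is needed because the length is not guaranteed to be a multiple of four. If the buffer size is always a multiple of four, it can be dropped.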
Divisions are slow. I have not tried it, but maybe it helps to multiply with 1.0f/max instead:
max = std::max(max, 10.0f);
max = 1.0f/max;
for (int i = 0; i < m_signalLength; i++) {
m_targetSignal *=...