I just change something and measure the elapsed microseconds to see which is faster. The unrolled version was twice as fast, so I kept it. It doesn’t matter what anyone says: if it’s faster, it’s faster.
However, the code before was...
Ah, I missed that. Thanks, will fix. It doesn’t really do anything because the signals are usually in the thousands. It’s only there to guard against dividing by zero, which is pretty much impossible anyway.
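For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the "change something and time it" approach. It assumes a hosted C++ toolchain with `std::chrono`; on a microcontroller you would likely read a hardware timer instead. `AverageMicros` is a hypothetical helper name, not something from my project.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls `fn` `repeats` times and returns the average
// elapsed wall-clock time per call, in microseconds.
template <typename Fn>
double AverageMicros(Fn&& fn, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    // Convert the elapsed duration to floating-point microseconds.
    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Write the result of the measured code to a `volatile` variable (or print it), otherwise the compiler may remove the work you are trying to time. Averaging over many repeats also smooths out timer resolution.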
You mean: "std::max(max, 10.0f)"?
That’s intentional for my use case. I don’t care about max values below 80, and a value of 10 is just noise.
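To make the idea concrete, here is a rough sketch of flooring the peak before dividing by it. `NormalizeByPeak` and its parameters are hypothetical names for illustration, and the use of absolute values for the peak is an assumption. Only the `std::max(max, 10.0f)` floor comes from the actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: scales `signal` in place by its peak absolute value.
// The peak is floored at `minPeak`, so an all-zero (or pure-noise) buffer
// can never cause a division by zero.
void NormalizeByPeak(float* signal, std::size_t count, float minPeak = 10.0f) {
    assert(minPeak > 0.0f);
    float max = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        max = std::max(max, std::fabs(signal[i]));
    }
    max = std::max(max, minPeak);  // the floor discussed above
    const float inv = 1.0f / max;  // divide once, then multiply per sample
    for (std::size_t i = 0; i < count; ++i) {
        signal[i] *= inv;
    }
}
```

With a floor of 10, any buffer whose peak is below 10 is scaled as if the peak were 10, so noise stays small instead of being blown up to full scale.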
I used a regular for loop, but it was way slower, so I unrolled it manually. I...
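As an illustration of what manual unrolling means here, this is a small sketch with a loop version and an unrolled version of the same max search. The sample count of 4 and the names `MaxLoop` and `MaxUnrolled` are assumptions made up for the example; my real function works on different data.

```cpp
#include <algorithm>
#include <cassert>

constexpr int kSamples = 4;  // assumed buffer length, for illustration only

// Regular loop: one compare-and-branch on the counter per iteration.
float MaxLoop(const float* s) {
    assert(s != nullptr);
    float max = 0.0f;
    for (int i = 0; i < kSamples; ++i) {
        max = std::max(max, s[i]);
    }
    return max;
}

// Manually unrolled: the same work written out with constant indices,
// so there is no loop counter to increment or test.
float MaxUnrolled(const float* s) {
    assert(s != nullptr);
    float max = 0.0f;
    max = std::max(max, s[0]);
    max = std::max(max, s[1]);
    max = std::max(max, s[2]);
    max = std::max(max, s[3]);
    return max;
}
```

Both functions return the same value. Whether the unrolled one is actually faster depends on the compiler, its optimisation level, and the target, so it is worth timing both.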
However, a big part of the optimisation is that I only process every other sensor reading, since the requirements for this calculation are not that strict. So every other time I process one sensor pair, and the next time I...
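Roughly, alternating between sensor pairs can be sketched like this. It is only a guess at the general shape: `PairScheduler` is a hypothetical class, and a plain round-robin over the pairs is an assumption, not necessarily the exact order I use.

```cpp
#include <cassert>

// Hypothetical scheduler: for each new sensor reading, hand out the index
// of the single pair to process, cycling through all pairs in turn.
class PairScheduler {
public:
    explicit PairScheduler(int pairCount) : m_pairCount(pairCount) {
        assert(pairCount > 0);
    }

    // Returns the pair to process for this reading, then advances.
    int Next() {
        const int current = m_next;
        m_next = (m_next + 1) % m_pairCount;
        return current;
    }

private:
    int m_pairCount;
    int m_next = 0;
};
```

With two pairs, successive calls to `Next()` return 0, 1, 0, 1, so each pair is only processed on every other reading and the work per reading is halved.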
Sure:
float CrossCorrelation::ComputeHead(int sensorID) {
float max = 0.0f;
//unrolled loop, the "0" is incremented every time as in a for loop
m_targetSignal =...
I executed 10 of these calculations and printed the result. It's like 10 times faster lol.
Thanks for the link and the suggestion for the optimization. This knowledge will come in handy later when I have to do a lot...
Well, I should be able to save around 1000 cycles per sensor pair, and that is a lot. I'm doing this for 5 sensor pairs in between the sensor readings, and I only have 5 microseconds of total time to calculate a sensor pair...
I will try it; it would explain why it is so slow. I was expecting a lot fewer cycles. I assumed it was all the comparisons for the loop. Unrolling the loop made it twice as fast, but if this works and is much faster...
Really? In that case it should make a big difference. I think I read in a post on the forum that division, multiplication, addition, and subtraction each take 1 cycle with floats and two cycles with doubles. I believe it...