sin, sinf, cos, cosf - strange timing results on Teensy 4

Frank B

Code:
// sincosf() is a GNU extension; declare it here in case the headers do not expose it.
extern "C"
{
  void sincosf(float x, float *s, float *c);
}

void setup() {
 long m;
 float res;
 
 res = 0;
 m = micros();
 for (int i=0;i<100000;i++) {
  float s =sin(i);
  float r =cos(i);
  res += s * r;
 }
 asm volatile ("dsb":::"memory");
 m = micros() -m;
 Serial.println("sin():");
 Serial.println(res);
 Serial.println(m);


 res = 0;
 m = micros();
 for (int i=0;i<100000;i++) {
  float s =sinf(i);
  float r =cosf(i);
  res += s * r;
 }
 asm volatile ("dsb":::"memory");
 m = micros() -m;
 Serial.println("sinf():");
 Serial.println(res);
 Serial.println(m);


 res = 0;
 m = micros();
 for (int i=0;i<100000;i++) {
  float Sin, Cos;
  sincosf(i, &Sin, &Cos);
  res += Sin * Cos;
 }
 asm volatile ("dsb":::"memory");
 m = micros() -m;
 Serial.println("SinCosf:");
 Serial.println(res);
 Serial.println(m);
}

void loop() {
}

One would assume that float x = sinf(float) is faster than float x = sin(float).
Nope.
sin(): 95874 microseconds
sinf(): 662325 microseconds
Why? Can you please fix this?
 
Some time ago I did a T3.2 project that involved a PID controller regulating the angle between two motors. Since I assumed that the trig functions would do some modulo 2*PI reduction internally, I didn't bother to avoid large arguments. In the end I found out the hard way that the trig functions get extremely inefficient for large arguments.

Seeing your post, I remembered this and did a quick experiment to find out what happens if one reduces the maximum angle. Here is my test code. I manually increased maxPhi from 1 to 800 and plotted the results.

Code:
void setup()
{
  while(!Serial);

  long m;
  float res;
  float phi;
  constexpr unsigned loops = 100'000;
  constexpr float maxPhi = 300.0;
  constexpr float dPhi = maxPhi / loops;

  phi = 0;
  res = 0;
  m = micros();
  for (unsigned i = 0; i < loops; i+=1)
  {
    float s = sin(phi);
    float r = cos(phi);
    res += s * r;
    phi += dPhi;
  }
  asm volatile("dsb" ::: "memory");
  m = micros() - m;
  Serial.println("sin():");
  Serial.println(res);
  Serial.println(m);

  phi = 0;
  res = 0;
  m = micros();
  for (unsigned i = 0; i < loops; i++)
  {
    float s = sinf(phi);
    float r = cosf(phi);
    res += s * r;
    phi += dPhi;
  }
  asm volatile("dsb" ::: "memory");
  m = micros() - m;
  Serial.println("sinf():");
  Serial.println(res);
  Serial.println(m);
}

void loop()
{
}


Results:
Code:
maxPhi [rad]   double [µs]   float [µs]
1              45461         26249
2              52708         31658
3              58923         36152
5              65796         41319
10             70930         45083
20             73458         47035
50             74956         48188
100            76304         50921
200            76996         53987
300            77272         230641
400            77418         320156
800            77712         455706

Graph (attachment): sinf.jpg

For maxPhi below roughly 250 rad everything is fine. Beyond that, the efficiency of the sinf algorithm drops dramatically. So it looks like the GCC team improved the sinf algorithm at some point between the compiler versions.
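The manual sweep over maxPhi described above could be automated with a small helper like the following. This is only a sketch: `SweepResult` and `timeSinCosf` are hypothetical names, and the clock is passed in as a parameter so the same function can be driven by `micros()` on the Teensy or by any other microsecond counter.

```cpp
#include <math.h>
#include <stdint.h>

// Hypothetical result type: checksum keeps the compiler from dropping the loop.
struct SweepResult {
  float checksum;
  uint32_t elapsedMicros;
};

// Times `loops` sinf()/cosf() pairs with phi running from 0 up to maxPhi.
// nowMicros is any callable returning a microsecond counter, e.g. micros.
template <typename NowMicros>
SweepResult timeSinCosf(float maxPhi, unsigned loops, NowMicros nowMicros) {
  const float dPhi = maxPhi / loops;
  float phi = 0.0f;
  float res = 0.0f;
  const uint32_t start = nowMicros();
  for (unsigned i = 0; i < loops; i++) {
    res += sinf(phi) * cosf(phi);
    phi += dPhi;
  }
  const uint32_t elapsed = nowMicros() - start;  // unsigned math survives wrap-around
  return {res, elapsed};
}
```

On the Teensy one would presumably call it in a loop over the maxPhi values of interest, for example `timeSinCosf(300.0f, 100000, micros)`, and print `elapsedMicros` for each.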

Is there a way to fix this without installing a new compiler?
You can always reduce the argument modulo 2*PI before calling the function; that should fix it.
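A minimal sketch of such a reduction, assuming `fmodf()` is available and cheap enough on the target. The helper names `wrapAngle`, `sinfWrapped` and `cosfWrapped` are made up for illustration. Note that the float value of 2*PI is not exact, so the reduced angle picks up a small error that grows with the size of the argument; whether the wrapper is faster overall on the older toolchain is something to measure, not assume.

```cpp
#include <math.h>

// Hypothetical helper: wrap an angle in radians into roughly [-pi, pi).
static inline float wrapAngle(float phi) {
  const float pi = 3.14159265358979f;
  const float twoPi = 6.28318530717959f;
  float r = fmodf(phi, twoPi);  // r is in (-2*pi, 2*pi) and has the sign of phi
  if (r >= pi) {
    r -= twoPi;
  } else if (r < -pi) {
    r += twoPi;
  }
  return r;
}

// Hypothetical wrappers that keep the argument passed to sinf/cosf small.
static inline float sinfWrapped(float phi) { return sinf(wrapAngle(phi)); }
static inline float cosfWrapped(float phi) { return cosf(wrapAngle(phi)); }
```

If the angle is accumulated step by step (as in a control loop), wrapping it each time it is updated avoids the large argument in the first place.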
 
That looks much better :) Thank you very much.

Unfortunately, it does not help me much. I'm debugging a very large program that uses a lot of trigonometric functions of all kinds.
I'm sure it does not use large arguments.
So, obviously, my test program above does not reflect the real problem I'm seeing, and I have to reproduce the problem more precisely.

Or I just use GCC 9.
 
Stay cool, no need to hurry.

Unrelated: maybe in a beta for 1.5x (and only in the beta) you could add -Wdouble-promotion?

Thanks, Frank.
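For readers unfamiliar with the -Wdouble-promotion flag requested above: it makes GCC warn when a float value is implicitly promoted to double. The two functions below are made-up examples, not code from the program discussed in this thread.

```cpp
// The literal 2.0 is a double, so phi is promoted to double, multiplied in
// double precision and converted back to float on return.
// GCC's -Wdouble-promotion warns about the promotion on this line.
float scaledPromoted(float phi) { return phi * 2.0; }

// Same computation with a float literal: everything stays in single precision
// and the warning goes away.
float scaledFloat(float phi) { return phi * 2.0f; }
```

Both functions return the same value here; the difference is only in how much double-precision work the compiler has to emit.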


@Others: If you _really_ need it, there is a short guide in the wiki that lists the steps needed to upgrade the toolchain.
 