sin, sinf, cos, cosf - strange timing results on teensy 4

Frank B · Apr 5, 2020

Code:

extern "C"
{
  void sincosf(float x, float *s, float *c);  // GNU extension: computes sin(x) and cos(x) in one call
}

void setup() {
 long m;
 float res;
 
 res = 0;
 m = micros();
 for (int i=0;i<100000;i++) {
  float s =sin(i);
  float r =cos(i);
  res += s * r;
 }
 asm volatile ("dsb":::"memory");
 m = micros() -m;
 Serial.println("sin():");
 Serial.println(res);
 Serial.println(m);


 res = 0;
 m = micros();
 for (int i=0;i<100000;i++) {
  float s =sinf(i);
  float r =cosf(i);
  res += s * r;
 }
 asm volatile ("dsb":::"memory");
 m = micros() -m;
 Serial.println("sinf():");
 Serial.println(res);
 Serial.println(m);


 res = 0;
 m = micros();
 for (int i=0;i<100000;i++) {
  float Sin, Cos;
  sincosf(i, &Sin, &Cos);  
  res += Sin * Cos;
 }
 asm volatile ("dsb":::"memory");
 m = micros() -m;
 Serial.println("SinCosf:");
 Serial.println(res);
 Serial.println(m);
}

void loop() {
}

One would assume that float x = sinf(float) is faster than float x = sin(float).
Nope.
sin(): 95874 microseconds
sinf(): 662325 microseconds
Why? Can you please fix this?

Frank B · Apr 5, 2020

...to answer my question... it's the ancient compiler (or libc)
With GCC 9, all is correct.

Frank B · Apr 6, 2020

Frank B said:
...to answer my question... it's the ancient compiler (or libc)
With GCC 9, all is correct.

Is there a way to fix this without installing a new compiler?

luni · Apr 6, 2020

Some time ago I did a T3.2 project which involved a PID regulating the angles between two motors. Since I assumed that the trig functions would do some modulo 2π reduction internally, I didn't bother to avoid large arguments. In the end I found out the hard way that the trig functions get extremely inefficient for large arguments.

Seeing your post I remembered this and did a quick experiment to find out what happens if one reduces the max angle. Here is my test code. I manually increased maxPhi from 1 to 800 and plotted the results.

Code:

void setup()
{
  while(!Serial);

  long m;
  float res;
  float phi;
  constexpr unsigned loops = 100'000;
  constexpr float maxPhi = 300.0;
  constexpr float dPhi = maxPhi / loops;

  phi = 0;
  res = 0;
  m = micros();
  for (unsigned i = 0; i < loops; i+=1)
  {
    float s = sin(phi);
    float r = cos(phi);
    res += s * r;
    phi += dPhi;
  }
  asm volatile("dsb" ::: "memory");
  m = micros() - m;
  Serial.println("sin():");
  Serial.println(res);
  Serial.println(m);

  phi = 0;
  res = 0;
  m = micros();
  for (unsigned i = 0; i < loops; i++)
  {
    float s = sinf(phi);
    float r = cosf(phi);
    res += s * r;
    phi += dPhi;
  }
  asm volatile("dsb" ::: "memory");
  m = micros() - m;
  Serial.println("sinf():");
  Serial.println(res);
  Serial.println(m);
}

void loop()
{
}

Results:

Code:

maxPhi [rad]   double, sin()/cos() [µs]   float, sinf()/cosf() [µs]
1              45461                      26249
2              52708                      31658
3              58923                      36152
5              65796                      41319
10             70930                      45083
20             73458                      47035
50             74956                      48188
100            76304                      50921
200            76996                      53987
300            77272                      230641
400            77418                      320156
800            77712                      455706

[Graph: the timings above plotted against maxPhi]

For maxPhi below roughly 250 rad everything is fine. Above that, the efficiency of the sinf algorithm drops dramatically. So it looks like the sinf implementation was improved somewhere between the two toolchain versions.
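
The manual sweep over maxPhi could also be automated. Here is a minimal sketch, assuming a hypothetical helper timeSinf() that is not part of the test code above. It takes the clock as a parameter so that micros() can be passed in on a Teensy, and it uses a plain compiler barrier where the test code uses "dsb":

```cpp
#include <math.h>
#include <stdint.h>

// Hypothetical helper: runs the sinf()/cosf() loop once for a given maxPhi
// and returns the elapsed time as reported by nowMicros().
// The accumulated sum is written to res so the loop cannot be optimized away.
template <typename Clock>
uint32_t timeSinf(float maxPhi, unsigned loops, Clock nowMicros, float &res)
{
  const float dPhi = maxPhi / loops;
  float phi = 0;
  res = 0;

  uint32_t t0 = nowMicros();
  for (unsigned i = 0; i < loops; i++)
  {
    res += sinf(phi) * cosf(phi);
    phi += dPhi;
  }
  asm volatile("" ::: "memory");  // compiler barrier only (GCC/Clang syntax)
  return nowMicros() - t0;
}
```

On the Teensy, calling timeSinf(maxPhi, 100000, micros, res) for each maxPhi value from the table and printing the return value should give something close to the float column, depending on the toolchain version.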

Frank B said:
Is there a way to fix this without installing a new compiler?

You can always reduce the argument modulo 2π before calling the function; that should fix it.
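
A minimal sketch of that reduction, assuming hypothetical helper names (wrapAngle, sinfWrapped, cosfWrapped) that do not come from any library:

```cpp
#include <math.h>

// Hypothetical helper: wrap an angle into the open interval (-2*pi, 2*pi)
// so that sinf()/cosf() never see a large argument.
// fmodf() keeps the sign of phi, so negative angles stay negative.
static inline float wrapAngle(float phi)
{
  const float twoPi = 6.28318530718f;
  return fmodf(phi, twoPi);
}

// sinf()/cosf() with the argument reduced first.
static inline float sinfWrapped(float phi) { return sinf(wrapAngle(phi)); }
static inline float cosfWrapped(float phi) { return cosf(wrapAngle(phi)); }
```

Two caveats: fmodf() has a cost of its own, and because the float value of 2π is rounded, the error of the reduced angle grows with the number of full turns removed. For angles of a few hundred radians that error stays in the 1e-5 range, which is usually acceptable for float work.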

Frank B · Apr 6, 2020

That looks much better

Thank you very much.

Unfortunately, it does not help me much. I'm debugging a very large program that uses a large number of trigonometric functions of all kinds.
I'm sure it does not use large arguments.
So, obviously, my test program above does not reflect the real problem I'm seeing, and I have to reproduce the problem more exactly.

Or I just use GCC 9.

PaulStoffregen · Apr 6, 2020

Frank B said:
With GCC 9, all is correct.

I want to attempt a toolchain update about 1 month after Teensy 4.1 is released.

Frank B · Apr 6, 2020

Stay cool, no need to hurry.

Unrelated: maybe in a beta for 1.5x (and only in the beta) you could add -Wdouble-promotion?
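
For anyone unfamiliar with that flag, here is a small illustration of the kind of code it reports. The function names are made up for the example:

```cpp
#include <type_traits>

// The literal 0.5 is a double, so x is implicitly promoted, the multiply is
// done in double, and the result is converted back to float on return.
// GCC's -Wdouble-promotion warns about this implicit float-to-double step.
float halfPromoted(float x) { return x * 0.5; }

// With a float literal the whole expression stays in single precision.
float halfFloat(float x) { return x * 0.5f; }

// The expression types make the difference visible at compile time.
static_assert(std::is_same<decltype(1.0f * 0.5), double>::value,
              "float * double literal is evaluated as double");
static_assert(std::is_same<decltype(1.0f * 0.5f), float>::value,
              "float * float literal stays float");
```

Both functions return the same value here; the difference is only in how much work is done in double precision along the way.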

Thanks, Frank.

@Others: If you _really_ need it, there is a short guide in the WIKI that lists the necessary steps to upgrade the toolchain.

HallMark · Apr 9, 2020

Frank B said:
@Others: If you _really_ need it, there is a short guide in the WIKI that lists the necessary steps to upgrade the toolchain.

@Frank: Can you share the wiki link?

luni · Apr 9, 2020

https://github.com/TeensyUser/doc/wiki/GCC
