Double precision floating point for Teensy 3.2

paulhat · Mar 20, 2022

Hi All,

I've found what appear to be answers to this question, but I'm struggling to implement a solution and just need a bit of help.

I am running a Teensy 3.2 and trying to convert a long integer into a double precision floating point.
My understanding is that the Teensy 3.2 uses software emulation for standard and double floats, and that by default, all floats are compiled as 32 bit.
If you want to declare a constant as a double float you have to put an L or l after it.

I am reading latitude from an RTK GPS unit. It comes back as a long with the degrees then 7 decimal places.
For example 22.1234567 would come back as 221234567.

If I run the following code I get

Code:

 22.1234570

as the result. It looks like the value is being rounded and losing precision.

Code:

   float test=0.0L;

   long latitude = 221234567;
   test = abs(latitude/10000000.0L);
   Serial.printf( "%12.7lf",test );

I don't think the declaration of test

Code:

float test=0.0L;

is the correct way to declare test as double precision.
Same with dividing by the constant which is declared as double precision

Code:

test = abs(latitude/10000000.0L);

.

Any ideas on how to do this properly?

Paul.

PaulStoffregen · Mar 20, 2022

paulhat said:
I don't think the declaration of test ... is the correct way to declare test as double precision.

Yup. In C++, use "double" rather than "float", like this:

Code:

void setup() {
  Serial.begin(9600);
  while (!Serial) ; // wait for Arduino Serial Monitor
  long latitude = 221234567;
  double test = abs(latitude / 10000000.0L);
  Serial.printf( "%12.7lf", test);
  Serial.println();
  test = 12.345678901234L;
  Serial.printf( "%17.12lf", test);
}

void loop() {
}
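
For anyone who wants to check the difference off the board, here is a minimal sketch of the same conversion written as plain C++ functions. The function names are made up for illustration, and abs() is left out to keep it short. The float version is only there to show the rounding from the original question.

```cpp
// Convert a fixed-point latitude (degrees * 1e7) to degrees as a 64-bit double.
// The 'L' suffix keeps the divisor out of single precision on builds that use
// -fsingle-precision-constant; on a standard build a plain 10000000.0 also works.
double latitudeToDegrees(long raw) {
  return static_cast<double>(raw / 10000000.0L);
}

// Same conversion stored in a 32-bit float. A float carries only about
// 7 significant decimal digits, so the last digits of the latitude are lost.
float latitudeToDegreesFloat(long raw) {
  return static_cast<float>(latitudeToDegrees(raw));
}
```

On the Teensy this would be printed the same way as above, e.g. `Serial.printf("%12.7lf", latitudeToDegrees(latitude));`.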

joepasquariello · Mar 20, 2022

PaulStoffregen said:

Yup. In C++, use "double" rather than "float", like this:

Code:

void setup() {
  Serial.begin(9600);
  while (!Serial) ; // wait for Arduino Serial Monitor
  long latitude = 221234567;
  double test = abs(latitude / 10000000.0L);
  Serial.printf( "%12.7lf", test);
  Serial.println();
  test = 12.345678901234L;
  Serial.printf( "%17.12lf", test);
}

void loop() {
}

@PaulStoffregen, Paul, I was familiar with the use of suffixes "L", "UL", "LL", and "ULL" for integers in C and C++, but not with "L" for floating-point values. I did some searching, and what I'm reading says that "L" specifies long double, as opposed to double. So, "f" suffix = float, no suffix = double, and "L" suffix = long double. If that's correct, the two uses of the "L" suffix in your test program should be omitted.

PaulStoffregen · Mar 20, 2022

Yes, that is true normally.

But on Teensy 3.2, 3.5, 3.6, LC we compile with gcc's optional flag -fsingle-precision-constant, which changes things slightly. In Arduino, you can see the exact compiler commands with "verbose output during compilation" in File > Preferences.

We use -fsingle-precision-constant because Teensy 3.5 & 3.6 have hardware FPU for 32 bit float, but not 64 bit double. And on Teensy LC & 3.2, software 32 bit float is so much faster than 64 bits. While behavior deviates from the C / C++ standards, it's considered a good trade off because the Arduino ecosystem is filled with casually written code originally intended for AVR where both float and double are 32 bits. Without this flag, virtually all commonly found Arduino code promotes all float math to using 64 bits, even when the final result gets stored into a 32 bit float. On Teensy the long double type is implemented as 64 bits, exactly the same as normal double.
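
To make the effect of the suffixes concrete, here is a small sketch that measures how much is lost when the constant from the test program is held as a 32-bit float instead of a 64-bit value. The "f" suffix is used to stand in for what an unsuffixed constant becomes under -fsingle-precision-constant, so the code behaves the same on a desktop compiler. The names are illustrative only.

```cpp
// Stand-in for an unsuffixed constant on a -fsingle-precision-constant build:
// the value is rounded to 32-bit single precision.
constexpr float kAsSingle = 12.345678901234f;

// 'L' suffix: long double, narrowed here to a 64-bit double.
constexpr double kAsDouble = static_cast<double>(12.345678901234L);

static_assert(sizeof(float) == 4, "expects 32-bit float");
static_assert(sizeof(double) == 8, "expects 64-bit double");

// Absolute rounding error introduced by holding the constant as a float.
double singlePrecisionError() {
  double diff = static_cast<double>(kAsSingle) - kAsDouble;
  return diff < 0 ? -diff : diff;
}
```

The error is small in absolute terms, but it lands in the 7th decimal place, which is exactly where the latitude digits in the original question live.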

defragster · Mar 20, 2022

IIRC, the build system is set to use float unless "L" is specified, as in Paul's example, at least on T_3.x, where defaulting to "L" is generally not desired because of the added overhead.

... Paul for the win ... but the recall was correct ...

MichaelMeissner · Mar 20, 2022

defragster said:
IIRC, the build system is set to use float unless "L" is specified, as in Paul's example, at least on T_3.x, where defaulting to "L" is generally not desired because of the added overhead.

... Paul for the win ... but the recall was correct ...

Technically, using "L" as a suffix makes the constant 'long double'. Fortunately for the usage, on current ARM compilers, 'long double' has the same representation as 'double'. But on other systems, 'long double' is a different type.

For example, on x86_64 compilers 'long double' is by default 80 bits, though in storage it is padded out to 128 bits. This floating point format is based on the 8087 floating point co-processor, which was an add-on to the original 8086/8088 processors, and current x86_64 processors still support it. There is an option to use IEEE 128-bit floating point instead of the hardware 80-bit format. On current x86_64 hardware, IEEE 128-bit floating point is emulated in software.

Also, the PowerPC (that I work on) has two different 128-bit formats. The legacy format (from the IBM AIX operating system) is a pair of doubles that gives you more mantissa bits, but the exponent range is still the same. The new format, which I've been working on, on and off, for the last 6 years, uses the IEEE 128-bit floating point format, which gives you both larger exponents and larger mantissas. Starting with the Power9 servers, the hardware has support for the IEEE 128-bit floating point format, and we are beginning to have Linux distributions switch to the IEEE 128-bit format instead of the legacy format, but it will still take some years before everything is in place.

The original C standard defaulted to constants being 'double', and you had to use the "f" or "F" suffix to get an explicit single precision constant. You would need to use "l" or "L" to get an explicit 'long double' constant. Because 'double' is the default, there was no explicit suffix for explicit 'double' constants.

Going back to the original C compilers on the PDP-11 by Dennis Ritchie, many of the PDP-11 models had to switch from 64-bit support to do 32-bit support, and then switch back again. These ancient compilers would typically do everything in 'double', and the C language originally was biased to always promote 'float' to 'double'. The first C standard (ANSI standard in 1989 and ISO standard in 1990) changed this so that expressions only involving 'float' would not be required to convert things to 'double', but they could if desired (and some did).

But as Paul says, the Teensy build switches to using '-fsingle-precision-constant' due to the Teensy 3.5/3.6 only having hardware support for 32-bit floating point. If you mix 'float' and 'double', the 'float' is converted to 'double', which is emulated in software (on the LC and 3.2, both 'float' and 'double' are emulated in software). On the Teensy 4.0/4.1/MicroMod, the hardware supports both 32-bit and 64-bit floating point and the '-fsingle-precision-constant' switch is not used.
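
If you want to see which situation a given compiler is in, a rough sketch like the following reports the mantissa width of each type through the standard `std::numeric_limits`. The constant and function names are just for illustration.

```cpp
#include <limits>

// Significand (mantissa) width in bits for each type on the target being compiled for.
constexpr int kFloatBits = std::numeric_limits<float>::digits;
constexpr int kDoubleBits = std::numeric_limits<double>::digits;
constexpr int kLongDoubleBits = std::numeric_limits<long double>::digits;  // varies by platform

// True when 'long double' carries no more precision than 'double',
// which is the case described above for ARM targets such as Teensy.
constexpr bool longDoubleIsJustDouble() {
  return kLongDoubleBits == kDoubleBits;
}

// 'long double' is never narrower than 'double'.
static_assert(kLongDoubleBits >= kDoubleBits, "long double narrower than double");
```

On a target where `longDoubleIsJustDouble()` is true, the "L" suffix costs nothing extra over a plain double constant; where it is false, "L" constants pull the arithmetic into a wider type.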

joepasquariello · Mar 20, 2022

PaulStoffregen said:
Yes, that is true normally.

But on Teensy 3.2, 3.5, 3.6, LC we compile with gcc's optional flag -fsingle-precision-constant, which changes things slightly. In Arduino, you can see the exact compiler commands with "verbose output during compilation" in File > Preferences.

We use -fsingle-precision-constant because Teensy 3.5 & 3.6 have hardware FPU for 32 bit float, but not 64 bit double. And on Teensy LC & 3.2, software 32 bit float is so much faster than 64 bits. While behavior deviates from the C / C++ standards, it's considered a good trade off because the Arduino ecosystem is filled with casually written code originally intended for AVR where both float and double are 32 bits. Without this flag, virtually all commonly found Arduino code promotes all float math to using 64 bits, even when the final result gets stored into a 32 bit float. On Teensy the long double type is implemented as 64 bits, exactly the same as normal double.

Thanks, this is good info. I've been working on T3.5 recently, being careful to use the "f" suffix to be sure that all floating-point operations are 32-bit. This makes it a little tricky to plan for portability of code from T3.x to T4.x. I know I want 32-bit float operations on T3.x, but if I was doing the same things on T4.x, I might want to use 64-bit, if it's just as fast. In any case, I'm glad this topic came up.
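
One possible way to plan for that portability, sketched here as an assumption rather than an established Teensy convention, is to route all floating-point work through a single type alias and a literal wrapper. `USE_DOUBLE_MATH`, `real_t`, `REAL` and `scaleReading` are hypothetical names, not something Teensyduino defines.

```cpp
// Hypothetical switch: define USE_DOUBLE_MATH for builds where 64-bit math is wanted
// (e.g. T4.x); leave it undefined to stay with 32-bit float (e.g. T3.x).
#if defined(USE_DOUBLE_MATH)
using real_t = double;
#else
using real_t = float;
#endif

// Wrap a literal so it ends up as real_t whether or not the build uses
// -fsingle-precision-constant. The 'L' keeps full precision until the cast.
#define REAL(x) (static_cast<real_t>(x##L))

// Example use: scale a fixed-point reading (value * 1e7) into real_t.
real_t scaleReading(long raw) {
  return static_cast<real_t>(raw) / REAL(10000000.0);
}
```

With this layout, switching a project between 32-bit and 64-bit math is a one-line change in the build flags instead of a hunt for "f" suffixes.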

paulhat · Mar 21, 2022

Hi All,

Thank you for all the advice.
For me this felt like a big hurdle to overcome.
Thanks to the forum I was able to write a quick bit of code this afternoon that allowed me to test the accuracy (or maybe repeatability is a better word) of my GPS.
If I came within 20cm of the target a buzzer would sound. RTK GPS is simply amazing. For less than a thousand bucks you get pretty incredible results. I reckon that I'm getting sub 4cm accuracy at worst.
No way I could have got this working without sorting the old double precision maths out.

Paul.
