# problem with double precision floating point constants

Status
Not open for further replies.

#### el_supremo

##### Well-known member
This is on Teensy 3/3.1.
About a year ago I wrote a program to parse algebraic equations and using that I wrote code to do numerical integration. I haven't used it for quite a while but today I found that it is giving me incorrect results whereas previously it was working fine. I eventually tracked the problem down to the way that double precision constants are stored using Arduino 1.6.5_r2 (with TeensyDuino 1.25 or 1.26).
A constant such as M_PI, which is defined in math.h as 3.14159265358979323846264338, should (by default) be compiled as double precision but in Arduino version 1.6.5_r2 it is obviously storing it as a float because it is printed as 3.1415927410125732 which is only correct to 7 places.
Here's some code which shows the problem using a function I modified to print double precision and using the floating point version of printf.
Code:
```
#include <math.h>

void test_printDoubleln(double number, uint8_t digits)
{
  test_printDouble(number,digits);
  Serial.println("");
}

void test_printDouble(double number, uint8_t digits)
{
  uint8_t sign=0;

  // Handle negative numbers
  if (number < 0.0) {
    sign = 1;
    number = -number;
  }

  // Round correctly so that print(1.999, 2) prints as "2.00"
  double rounding = 0.5;
  for (uint8_t i=0; i<digits; ++i) {
    rounding *= 0.1;
  }
  number += rounding;

  // Extract the integer part of the number and print it
  unsigned long int_part = (unsigned long)number;
  double remainder = number - (double)int_part;

  if(sign)Serial.print("-");
  Serial.print(int_part);
  // Print the decimal point, but only if there are still
  // fractional digits to be printed
  if (digits > 0) {
    //>>> buf[8] is inadequate - use buf[20]
    uint8_t n, buf[20], count=1;
    buf[0] = '.';

    // Extract digits from the remainder one at a time
    if (digits > sizeof(buf) - 1) digits = sizeof(buf) - 1;

    while (digits-- > 0) {
      remainder *= 10.0;
      n = (uint8_t)(remainder);
      if(n > 9) {
        buf[count++] = '?';
        break;
      }
      buf[count++] = '0' + n;
      remainder -= n;
    }
    Serial.write(buf,count);
  }
}

void setup()
{
  asm(".global _printf_float");

  Serial.begin(9600);
  while (!Serial);
  delay(100);

  Serial.println("Should be\n3.14159265358979323846264338");

  // This is the big problem. This is rounded to a float
  test_printDoubleln(M_PI,16);

  // force the constant to be long double. This prints OK
  test_printDoubleln(3.14159265358979323846264338L,16);
  // this should be a double but in 1.6.5_r2 and later
  // it obviously is rounded to a float
  test_printDoubleln(3.14159265358979323846264338,16);

  Serial.println("printf");
  Serial.printf("%18.16lf\n",3.14159265358979323846264338L);
  Serial.printf("%18.16lf\n",3.14159265358979323846264338);
}

void loop()
{
}
```

This code runs correctly with Arduino versions 1.0.6, 1.6.1 and 1.6.4

Anyone run into this?
Am I forgetting a magic incantation or is something up the creek?

Pete

Addendum: here's the output I get from two different versions of Arduino:
Code:
```
         1.6.5_r2                          1.6.4
Should be                         Should be
3.14159265358979323846264338      3.14159265358979323846264338
3.1415927410125732                3.1415926535897931
3.1415926535897931                3.1415926535897931
3.1415927410125732                3.1415926535897931
printf                            printf
3.1415926535897931                3.1415926535897931
3.1415927410125732                3.1415926535897931
```
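
For anyone without the affected toolchain, the two columns can be reproduced on a desktop C++ compiler by rounding the literal to `float` on purpose. This is only a sketch of the effect: the `f` suffix stands in for what the 1.6.5_r2 build does to an unsuffixed literal, and `format16`, `piAsDouble` and `piAsFloat` are made-up helper names, not part of any Teensy or Arduino API.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format a double with 16 digits after the decimal point.
std::string format16(double value)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%.16f", value);
    return std::string(buf);
}

// The literal kept as a double (the behaviour seen with 1.6.4 and earlier).
std::string piAsDouble()
{
    return format16(3.14159265358979323846264338);
}

// The literal rounded to float first, then widened back to double
// (assumed here to mimic the behaviour seen with 1.6.5_r2).
std::string piAsFloat()
{
    return format16(static_cast<double>(3.14159265358979323846264338f));
}
```

On an IEEE 754 host, `piAsDouble()` and `piAsFloat()` should give the 1.6.4 and 1.6.5_r2 values from the table respectively.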

Pete

Yup, this was a change in 1.6.5. Float constants now default to only single precision. If you want double, you have to add "L".

I know this change is somewhat controversial.

The problem with defaulting to double precision is that just one floating point constant in the middle of an equation causes the whole thing to be promoted to slow double precision calculation, even if the final result is stored in a single precision float. The Arduino world is filled with such code, since AVR only supports single precision.

Soon we'll have a Teensy with a single precision FPU, but double precision will still use the same very slow software routines. Making more (most) floating point code use the FPU by default weighed heavily on this decision.

Of course, the downside is you have to add the "L" to constants if you really need double precision. Or you can edit boards.txt to delete "-fsingle-precision-constant" from the compiler flags.
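
To make the promotion rule concrete, here is a small standard C++ sketch (nothing Teensy-specific). It assumes a compiler run without `-fsingle-precision-constant`, where an unsuffixed literal is a `double`; the function names are invented for illustration.

```cpp
#include <type_traits>

// Unsuffixed literal: 0.5 is a double, so the multiplication is carried out
// in double precision and only narrowed to float on return.
float scaleWithDoubleConstant(float x)
{
    static_assert(std::is_same<decltype(x * 0.5), double>::value,
                  "float * double literal is evaluated as double");
    return x * 0.5;
}

// 'f' suffix: the whole expression stays in single precision, which is
// what a single precision FPU can execute directly.
float scaleWithFloatConstant(float x)
{
    static_assert(std::is_same<decltype(x * 0.5f), float>::value,
                  "float * float literal stays float");
    return x * 0.5f;
}
```

Writing the `f` suffix explicitly gets the single precision behaviour regardless of compiler flags, which may be the more portable choice for library code.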

Thanks very much guys. I can see that it would be a controversial change. My books on C all say that a floating point constant defaults to double precision.
I have added my own versions of M_PI etc. to use the "L" suffix which forces them all to be double precision but perhaps I also need to look at making a single precision version of my code when that Teensy with single precision FPU hits the market.
But there's another downside: all of this will have to be changed back to double precision when 64-bit FPUs are cheap and plentiful, probably in about two years.

Thanks
Pete

Well technically, the 'L' suffix converts the constant to long double, not to double. In the Arm environment, both types use the same encoding. In Intel and PowerPC environments, long double has different encodings than double.
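
The suffix rules can be checked at compile time. The sketch below is plain standard C++; `kPi` is a hypothetical name for a user-defined replacement for `M_PI`, and the cast spells out the intended type rather than relying on `long double` and `double` sharing an encoding.

```cpp
#include <type_traits>

// Literal suffixes select the type: f -> float, L -> long double.
static_assert(std::is_same<decltype(1.0f), float>::value,
              "f suffix is float");
static_assert(std::is_same<decltype(1.0L), long double>::value,
              "L suffix is long double");

// Hypothetical replacement for M_PI: written as long double so the digits
// are not rounded to float, then converted once to double.
constexpr double kPi = static_cast<double>(3.14159265358979323846264338L);

// Returns the circumference of a circle using the double precision constant.
double circumference(double radius)
{
    return 2.0 * kPi * radius;
}
```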

If you are curious, the default long double for Intel is the 80-bit floating point format that the 8087 and its successors have defined, and there is an option to use the IEEE 754R standard 128-bit encoding. The PowerPC currently uses IBM extended double (a pair of double values to give more precision), and as part of my day job, I am adding support for IEEE 754R 128-bit floating point.

Making FP constants single precision prevents the compiler from automatically converting an expression from single precision to double. As you note, this will be helpful in the future when the Teensy 3++ has single precision floating point, but not double precision. However, it has the downside that you lose precision in the constants. In code ported over from Arduino AVR systems, users wouldn't notice because their double is the same representation as float. So it is a trade-off with no answer that wins in all situations. I think I wondered out loud at the time, when Paul made the choice, whether it might affect somebody, and it looks like it has affected you.

I first noticed the problem with a test in which my code evaluates the integral of sin(x^2) from zero to pi using 100 intervals (Simpson's rule). Internally, my code converts "pi" to M_PI. In versions before 1.6.5, M_PI is converted correctly to a double and the result of the integral is 0.7726530189891112, which agrees with a couple of online calculators. But when the upper limit is correct to only 7 decimal places the result is 0.7726529813709952.
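
The original parser and integrator are not shown in this thread, so the following is only a stand-in: a textbook composite Simpson's rule in standard C++, used to illustrate how much the upper limit alone moves the result. The names `integrand`, `simpson`, `kPiDouble` and `kPiFloat` are assumptions made for this sketch.

```cpp
#include <cmath>

// Integrand from the post: sin(x^2).
double integrand(double x)
{
    return std::sin(x * x);
}

// Composite Simpson's rule on [a, b] with n subintervals (n must be even).
double simpson(double a, double b, int n)
{
    const double h = (b - a) / n;
    double sum = integrand(a) + integrand(b);
    for (int i = 1; i < n; ++i) {
        // Odd interior points get weight 4, even interior points weight 2.
        sum += (i % 2 ? 4.0 : 2.0) * integrand(a + i * h);
    }
    return sum * h / 3.0;
}

// Upper limit as a true double, and as a double that was first rounded
// to float (mimicking a single precision constant).
const double kPiDouble = 3.14159265358979323846;
const double kPiFloat  = static_cast<double>(3.14159265358979323846f);
```

Comparing `simpson(0.0, kPiDouble, 100)` with `simpson(0.0, kPiFloat, 100)` should show a gap of a few parts in 10^8, the same order as the difference between the two results quoted above.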
I seem to be the only one who has run into this. The main problems I see with it are:
- every book on the C/C++ language says that unqualified floating point constants are double precision.
- there are no warnings from the compiler to indicate that it is doing something "unusual".

Just a suggestion to ponder: can this behaviour be controlled by a new option in the Tool menu which chooses either float or double constants?
I realize that I'm probably the only one to run into this so far, and I am just "playing" rather than creating a product, but I suspect that this will come back and bite sometime in the future.

Pete
P.S. the speed of software double precision isn't really an issue for me in this application. My calculator takes 23 seconds to do that integration. Teensy3.1 takes 7ms!

> can this behaviour be controlled by a new option in the Tool menu which chooses either float or double constants?

Yes. In fact, it's possible to add stuff to the menus just by editing boards.txt.

But just because you can do a thing doesn't mean you should do it. Already we've got more menu-configured options than other Arduino compatible boards. Eventually more things are likely to get added to the menu. My gut feeling is relatively minor settings really shouldn't compete for attention in a GUI used for configuring major high-level functionality.

Put the following in your source before the functions:

Code:
``#pragma GCC optimize("no-single-precision-constant")``

Oh, now that's really useful! I had no idea gcc offered this pragma.

Michael: I tried adding that at the top of the code in my first message and it still prints incorrect results.

Pete

> Oh, now that's really useful! I had no idea gcc offered this pragma.

Thanks. I added this to GCC about 7-8 years ago when I was at AMD to allow adding functions compiled with target specific optimizations. The idea was to provide modules that are compiled for, say, AMD K8, Intel Haswell, as well as generic. The last part of the original spec was not added, and the ifunc attribute provides the functionality of choosing at program startup what function to run (this needs the ELF shared library facility, and so probably is not usable in the Teensy environment).

Over the years, there has been some grumbling that allowing the finer grained options (like -fno-single-precision-constant) breaks things, particularly as more of the compilation is now done at the whole object file level rather than the module level that was the model when I first came up with it. Originally, it was just supposed to be target options (the -m options), that so far only x86, powerpc, and nios support. But the embedded folks asked for a way to declare functions hot/cold, and -O3 vs. -Os optimizations, and I kind of got carried away.

You can either declare it as a #pragma, or as a function attribute.

However, as el_supremo ran into, it seems to only work with C and not C++. So, it isn't that useful in the Arduino environment.

> Michael: I tried adding that at the top of the code in my first message and it still prints incorrect results.
>
> Pete

Hmmm, in testing it further, it looks like it isn't implemented in C++, just C. Unfortunately, that makes it nearly useless for the Arduino environment.

> But there's another downside: all of this will have to be changed back to double precision when 64-bit FPUs are cheap and plentiful - probably in about two years

The Arduino-guys will call it "triple" - no problem

> Hmmm, in testing it further, it looks like it isn't implemented in C++, just C. Unfortunately, that makes it nearly useless for the Arduino environment.

Yep, do you remember our discussion regarding pragmas in Arduino?
None of them work correctly.
I opened a bug report in Bugzilla regarding that problem; it was merged with another one that is years older.
It's unlikely that they will fix it in the next months... or years...

Perhaps it works with `extern "C" { #pragma ... }`? (Just an idea for now, I can't say if that would work.)

> Soon we'll have a Teensy with a single precision FPU, but double precision will still use the same very slow software routines. Making more (most) floating point code use the FPU by default weighed heavily on this decision.

Is there any sort of a timeline on this? An FPU might be just exactly what the doctor ordered.

> The Arduino-guys will call it "triple" - no problem

LOL. I have been trying to avoid that very issue in my own code by never using anything other than uint8_t, int32_t, etc. It's way too easy to make a mistake that bites you in the butt as soon as you change processor architectures, especially when writing libraries. It's interesting how floats, doubles, etc. are not defined with the same precision as integers in terms of their accuracy, resolution, etc.

As for doubles, they are now implemented properly on the Teensy 3.x/LC series. They sure gobble up a lot of RAM and presumably are slow as molasses to process!

Type int is of course undefined in bit size.
Common practice is to always use the types in stdint.h.
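
The contrast between fixed-width integers and floating point types can be sketched in standard C++. Nothing here is Teensy-specific; `kFloatBits`, `kDoubleBits` and `doubleIsWiderThanFloat` are names made up for this example, and the comments about 24 and 53 bits assume an IEEE 754 target.

```cpp
#include <cstdint>
#include <limits>

// Fixed-width integers state their size in the type name.
static_assert(sizeof(std::int32_t) == 4, "int32_t occupies four 8-bit bytes");
static_assert(sizeof(std::uint8_t) == 1, "uint8_t occupies one byte");

// Floating point types carry no such guarantee, so query the implementation.
// On IEEE 754 targets float has a 24-bit significand and double has 53.
constexpr int kFloatBits  = std::numeric_limits<float>::digits;
constexpr int kDoubleBits = std::numeric_limits<double>::digits;

// True when double really is wider than float. As noted earlier in the
// thread, on AVR double has the same representation as float.
constexpr bool doubleIsWiderThanFloat()
{
    return kDoubleBits > kFloatBits;
}
```

A library that depends on real double precision could use a `static_assert` on `doubleIsWiderThanFloat()` to fail at compile time instead of silently losing digits.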

Coders vs. engineers

The decimal keyword denotes a 128-bit data type. Compared to floating-point types, the decimal type has a greater precision and a smaller range, which makes it suitable for financial and monetary calculations. Precision is the main difference where double is a double precision (64 bit) floating point data type and decimal is a 128-bit floating point data type.

Double - 64 bit (15-16 digits)

Decimal - 128 bit (28-29 significant digits)

So decimals have much higher precision and are usually used in monetary (financial) applications that require a high degree of accuracy. But performance-wise, decimals are slower than the double and float types. Double is probably the most commonly used data type for real values, except when handling money. In general, the double type is going to offer at least as great precision and definitely greater speed for arbitrary real numbers.

Johnson

If you were programming in C (not C++, which is used by the Arduino layer), on some platforms there are the _Decimal32, _Decimal64, and _Decimal128 types that are 32, 64, or 128 bits wide. I believe x86_64, powerpc (server only), and s390 processors are the only ones to enable decimal support (i.e. arm does not enable it). Note that x86_64 uses a different format for how the decimal values are laid out compared to powerpc/s390. Current powerpc servers and s390 have decimal support in hardware.

I'm not as up on the decimal support in C++, but I believe C++ has rejected the C types, and instead does this via a library. I would suspect this library may not be supported in embedded systems like the Teensy uses.

For Teensy 3.5/3.6, you want to use float if possible and not double (which is why the option is used to make constants single precision). This is due to float being implemented in hardware, and double being implemented in software.
