Double divide corrupting on the Teensy 3.1?


ipaq3115

I'm converting a set of long GPS coordinates to doubles so that I can manipulate them. Everything seems to be working as it should except that there is an error or something when dividing the coordinate down so that the decimal lines up.

Here's the function I'm playing with

I split it up to divide by 10 at a time when I found out it wasn't converting the number correctly.

Code:
double longDegreeToDouble(long degree) {

	double divider = 10.00000000000000;
	double degreeD = degree;
	
	//if(D) { db << pstr("degreeD "); printFloat(degreeD,15); db << endl; }
	if(D) db << pstr("degreeD ") << dec << setprecision(15) << degreeD << ' ' << dec << setprecision(15) << divider << endl;
	
	for(int i=0;i<7;i++) {
	
		degreeD /= divider;
		
		//if(D) { db << pstr("degreeD "); printFloat(degreeD,15); db << endl; }
		if(D) db << dec << i << pstr("degreeD ") << dec << setprecision(15) << degreeD << ' ' << dec << setprecision(15) << divider << endl;
	
	}

	return degreeD;

}

Here's an example of a latitude I'm putting into the function

-776591120

... and this is what it's printing

Code:
degreeD -776591120.000000000000000 10.000000000000000
0degreeD -77659112.000000000000000 10.000000000000000
1degreeD -7765911.200000000186264 10.000000000000000
2degreeD -776591.119999999995343 10.000000000000000
3degreeD -77659.111999999993713 10.000000000000000
4degreeD -7765.911199999999553 10.000000000000000
5degreeD -776.591119999999932 10.000000000000000
6degreeD -77.659111999999993 10.000000000000000

That third line is the problem. No idea where those numbers at the end are coming from and it throws everything else off. Is it possible there is data being corrupted in the double somehow?
 
There is an explanation for this in the first paragraph of http://arduino.cc/en/Reference/Float.

Maybe something like the following can work acceptably enough for you:
Code:
long inDegree=<read_value>/1000;
float outDegree=(float)inDegree/10000.0;
Of course this just ditches the bottom three digits. If those are significant enough for you (they usually are), then it may be better to gear all the code in your project to keep the values in the range they are read in, and deal with that as best you can for your purposes.

If you need the floating-point value made by <read_value>/10000000 for something on the Teensy side, then you may have to wear the floating-point error there. If instead you are outputting values to a partner program on the PC, you could send the long value and do the division where floating point (double springs to mind) has enough significant digits to do it 'clean' at this value range.
 
The Teensy 3.1's type float is not the same as its type double (8 bytes).
The Arduino (AVR) type double is an alias for type float (4 bytes).

converting from float to double can have errors, right?
 
If Teensy 3.x uses 8 bytes to store double variables, then ipaq3115 could benefit a great deal by ditching float in favor of double. To be honest, I'm accustomed to the limitations of the 4-byte floating-point type, and I just haven't checked the appropriate file(s) to know whether Teensy 3.x uses 8 bytes for its double type.

If it does use 8 bytes, it must have a greater range of significant digits on both sides of the decimal point.


I'm willing to assume that if the value stored in a 4-byte float variable has no more digits than the float can represent adequately, then that value shouldn't be degraded by being transferred to an 8-byte double-precision variable. Perhaps I should ask "what do you mean by convert?", or perhaps I should have kept my lip zipped from the first opportunity ;)
 
From ./hardware/teensy/cores/teensy3/arm_math.h
Code:
  /**
   * @brief 64-bit floating-point type definition.
   */
  typedef double float64_t;
Definitely worth trying where float fails us more dramatically than we want to tolerate. Thanks very much, stevech.

Might be worth mentioning that integer/integer division is faster than float division, which (logically, at least) should be faster than float64_t division, but that is probably a negligible trade-off for accuracy's sake.
 
Microprocessors. Floating point. Avoid wherever possible.
Scale up by x, to get a fixed point integer. If speed is important.

There's also type long long, 64 bit integers. Or 64 bit fixed point with, say, for GPS, implicit 8 bits to left of implied decimal point in 64 bits, and 56 fractional bits. An extreme example.

Only if you need speed.

I've done a lot of GPS work and never needed all that precision. GPS per se isn't that precise.
 
Oh, epic fail on my part - ipaq3115's code in the OP is already using double :eek:

Sorry; even for accuracy's sake, an integer method is probably more desirable.
 
Mantissa (significand) errors are normal.
They happen because the ability to represent the precision in binary has limitations.
These limits mean that what you type in your source code isn't always seen and stored by the computer as you might expect!

What follows are the important points from Wikipedia. The URL for the whole story is http://en.wikipedia.org/wiki/Floating_point Note the words "usually" and how there is no guarantee even on how many bytes are used between platforms without additional help, such as using a bignum library.

Single precision, usually used to represent the "float" type in the C language family (though this is not guaranteed). This is a binary format that occupies 32 bits (4 bytes) and its significand has a precision of 24 bits (about 7 decimal digits).

Double precision, usually used to represent the "double" type in the C language family (though this is not guaranteed). This is a binary format that occupies 64 bits (8 bytes) and its significand has a precision of 53 bits (about 16 decimal digits).

The fact that floating-point numbers cannot precisely represent all real numbers, and that floating-point operations cannot precisely represent true arithmetic operations, leads to many surprising situations. This is related to the finite precision with which computers generally represent numbers.

For example, the non-representability of 0.1 and 0.01 (in binary) means that the result of attempting to square 0.1 is neither 0.01 nor the representable number closest to it. In 24-bit (single precision) representation, 0.1 (decimal) was given previously as e = −4; s = 110011001100110011001101, which is

0.100000001490116119384765625 exactly.

Squaring this number gives

0.010000000298023226097399174250313080847263336181640625 exactly.

Squaring it with single-precision floating-point hardware (with rounding) gives

0.010000000707805156707763671875 exactly.

But the representable number closest to 0.01 is

0.009999999776482582092285156250 exactly.
 
double with 64 bits (8 bytes) has about 16 decimal digits of precision. Your results are accurate to that level. Remember in binary, numbers like 0.2 are not precisely representable (they're similar to 0.3333333 in a decimal representation), so when your input happens to have a fractional part of 0.2, it can't be represented exactly in the bits available -- those bits (when converted back to a decimal representation) will not represent 0.2 precisely.

You can see in the results below (similar to yours) that they are accurate to 16 digits.

Code:
#define STOP {Serial.println("END");delay(1000); Serial.end(); while (1);}
double integerPart;
int i;

void setup() {
  while (!Serial);
  longDegreeToDouble(-776591120);
  longDegreeToDouble(-776591200);
  STOP
}

void loop() {}

double longDegreeToDouble(long degree) {
  double divider = 10.;
  double degreeD = degree;
  for (i = 0; i < 9; i++) {
    degreeD /= divider;
    Serial.printf("%2i degreeD %-27.16f %27.16f %22.15f\n", i, degreeD, divider, modf(degreeD, &integerPart));
  }
  for ( ; i < 18; i++) {
    degreeD *= divider;
    Serial.printf("%2i degreeD %-27.16f %27.16f %22.15f\n", i, degreeD, divider, modf(degreeD, &integerPart));
  }
  return degreeD;
}

Code:
 0 degreeD -77659112.0000000000000000          10.0000000000000000     -0.000000000000000
 1 degreeD -7765911.2000000001862645           10.0000000000000000     -0.200000000186265
 2 degreeD -776591.1199999999953434            10.0000000000000000     -0.119999999995343
 3 degreeD -77659.1119999999937136             10.0000000000000000     -0.111999999993714
 4 degreeD -7765.9111999999995533              10.0000000000000000     -0.911199999999553
 5 degreeD -776.5911199999999326               10.0000000000000000     -0.591119999999933
 6 degreeD -77.6591119999999933                10.0000000000000000     -0.659111999999993
 7 degreeD -7.7659111999999997                 10.0000000000000000     -0.765911200000000
 8 degreeD -0.7765911200000000                 10.0000000000000000     -0.776591120000000
 9 degreeD -7.7659111999999997                 10.0000000000000000     -0.765911200000000
10 degreeD -77.6591119999999933                10.0000000000000000     -0.659111999999993
11 degreeD -776.5911199999999326               10.0000000000000000     -0.591119999999933
12 degreeD -7765.9111999999995533              10.0000000000000000     -0.911199999999553
13 degreeD -77659.1119999999937136             10.0000000000000000     -0.111999999993714
14 degreeD -776591.1199999998789281            10.0000000000000000     -0.119999999878928
15 degreeD -7765911.1999999992549419           10.0000000000000000     -0.199999999254942
16 degreeD -77659112.0000000000000000          10.0000000000000000     -0.000000000000000
17 degreeD -776591120.0000000000000000         10.0000000000000000     -0.000000000000000
 0 degreeD -77659120.0000000000000000          10.0000000000000000     -0.000000000000000
 1 degreeD -7765912.0000000000000000           10.0000000000000000     -0.000000000000000
 2 degreeD -776591.1999999999534339            10.0000000000000000     -0.199999999953434
 3 degreeD -77659.1199999999953434             10.0000000000000000     -0.119999999995343
 4 degreeD -7765.9119999999993524              10.0000000000000000     -0.911999999999352
 5 degreeD -776.5911999999999580               10.0000000000000000     -0.591199999999958
 6 degreeD -77.6591200000000015                10.0000000000000000     -0.659120000000001
 7 degreeD -7.7659120000000001                 10.0000000000000000     -0.765912000000000
 8 degreeD -0.7765912000000000                 10.0000000000000000     -0.776591200000000
 9 degreeD -7.7659120000000001                 10.0000000000000000     -0.765912000000000
10 degreeD -77.6591200000000015                10.0000000000000000     -0.659120000000001
11 degreeD -776.5912000000000717               10.0000000000000000     -0.591200000000072
12 degreeD -7765.9120000000002619              10.0000000000000000     -0.912000000000262
13 degreeD -77659.1199999999953434             10.0000000000000000     -0.119999999995343
14 degreeD -776591.1999999999534339            10.0000000000000000     -0.199999999953434
15 degreeD -7765912.0000000000000000           10.0000000000000000     -0.000000000000000
16 degreeD -77659120.0000000000000000          10.0000000000000000     -0.000000000000000
17 degreeD -776591200.0000000000000000         10.0000000000000000     -0.000000000000000
END
 
I'm still not sure I understand. I get that there would be different fractions that wouldn't display precisely in a different base, but I don't understand why fractions make a difference, is it something to do with the divide operation that's going on? Using different math that takes the decimal into account? The way I understood a float or double worked is that it is just a normal integer with a separate number that keeps track of where the decimal is within that number. Representing it, I thought, is just figuring out where to print the decimal.

Sounds like going back to fixed point math and a long long type might be better for what I'm doing if I'm not going to run into the same fractional problems that a double has.

I was originally using a long type for everything, but I made the switch to double so that I could take the size of my map in pixels and the latitudes and longitudes of the edges of the map and create a scale for getting pixel coordinates of lat/lon locations on the map, for overlay purposes. I couldn't do that with any form of 32-bit number because there wasn't enough precision; I was doing it all separately and hardcoding the map scale. I didn't know about long long until just now. Guess I should have looked around a bit more before I jumped into the world of doubles :eek:
 
The problem isn't in the division per se, it is in the number that the division created.
The number 2 is exactly representable in binary. Divide it by 10 to get 0.2 and you've got a number which can't be exactly represented with a finite number of binary bits because in binary it is 0.0011001100110011001100110011.........etc ad nauseam.

The way I understood a float or double worked is that it is just a normal integer with a separate number that keeps track of where the decimal is within that number.

That is more or less what it does. But with a number like 0.2 you can't fit it into a convenient integer. You have to drop some bits off the end so that the representation isn't exact.

Pete
 
Yeah, I see how 0.2 is a problem but since there are 16 digits of resolution I'm thinking 0.2 really looks more like 200000000000000 with an exponent of 15 or wherever the computer decides that two ends up in the integer. Doesn't seem to me like you'd have a real fraction until that 2 got pushed out of the precision capability of the double at which point you don't care anyway.

I guess the question is, why is there a need for a binary number with a decimal point like you showed el_supremo? Are there math operations that need the number to be represented that way?
 
I'm thinking 0.2 really looks more like 200000000000000 with an exponent of 15
The problem is that you're thinking in decimal. The binary representation doesn't work like that. It has to be a binary exponent as well as a binary mantissa. i.e. the exponent is a power of two - not ten.

Pete
 
No, a float is not separated into integer and fraction portions. In fact it is normalized to a number between 1.000 and 1.99999... (actually 2 - 2^-52 at the top end) plus an exponent representing a power of 2 (roughly -1022 to +1023 for a double). So decimal 1.0 is stored as 1.0 with an exponent of 2^0, decimal 2.0 is stored as 1.0 with exponent 2^1, and so on. 10 is stored as 1.25 with exponent 2^3, and 100 as 1.5625 with exponent 2^6. You can see that the pattern doesn't follow a decimal convention, and even integers are not necessarily precisely represented (once they exceed 2^53 for a double).

This works great for binary computers, but doesn't always follow the user's intuition. Calculators (and banks?) often use BCD math to make the numbers follow your intuition, including allowing all integers to be precisely represented. BCD = binary-coded decimal.
 
The infamous novice error is to compare for equality between two floating point numbers that are the result of some computations.
float f1, f2;
f1 = equation...
f2 = equation...

if (f1 == f2) // oops
if (f1 != f2) // oops
 
The simple answer (as already stated, and also what I do) is to use integers and, for the decimal part, the % operator.
 
Decimal vs Double vs Float

Precision is the main difference: double is a double-precision (64-bit) floating-point data type, while decimal is a 128-bit decimal floating-point data type.

Double - 64 bit (15-16 significant digits)
Decimal - 128 bit (28-29 significant digits)

So decimals have much higher precision and are usually used in monetary (financial) applications that require a high degree of accuracy. Performance-wise, though, decimals are slower than the double and float types. Double is probably the most commonly used data type for real values, except when handling money.

Johnson
 
First of all, decimal types (_Decimal32, _Decimal64, and _Decimal128) are not supported by either the Teensy ARM GCC compiler or the AVR GCC compiler. As far as I know, decimal types are only supported by default on the Intel/AMD i386/x86_64 platforms and the IBM System Z/PowerPC platforms. On the Intel/AMD platforms, all of the decimal support is done in software, while both the System Z and PowerPC platforms support decimal in hardware.

Second in terms of digits, _Decimal32 gives you 7 digits of precision, _Decimal64 gives you 16 digits of precision, and _Decimal128 gives you 34 digits of precision. Binary 32-bit floating point (float) gives you 7-8 decimal digits, binary 64-bit floating point (double) gives you 15-16 digits of precision, and if you have IEEE 128-bit binary floating point (__float128), it gives you roughly 34 digits of precision. So it depends on what decimal type you use whether it has more digits of precision. Using the same size as the binary floating point, the decimal type may give you 1 more decimal digit of precision, since binary floating point doesn't give exact decimal digits.

In the PowerPC, the hardware support for decimal floating point is slower than the hardware support for binary floating point types, but it is much faster than doing it via software emulation.

Unfortunately, the two camps (Intel/AMD vs. IBM) could not agree on the format used for decimal types, so IEEE-754R allows both as interchange formats.

Note, only the Intel/AMD and PowerPC compilers give you access to __float128. The Intel/AMD support is done completely in software. I wrote the support for the PowerPC compiler and it first went in GCC 6.1 that was just released. I don't expect to have it fully supported until GCC 7.1 comes out (since now we need to go through the libraries to enable the support there). On the PowerPC side, it is currently done in software, but future machines supporting ISA 3.0 will have hardware support.

In terms of precision, I once worked on a compiler (DG/L) for Data General, that allowed you to do arithmetic on strings, and you could do arbitrary precision calculations. It would do the calculation, much the same way most humans do it, one decimal digit at a time.
 
Double - 64 bit (15-16 digits)
Decimal - 128 bit (28-29 significant digits)
.....
So Decimals have much higher precision and are usually used within monetary (financial) applications that require a high degree of accuracy.

15 digits can represent the entire world's annual gross domestic product to the nearest penny.

25 digits ought to be enough to represent cumulative worldwide GDP until the sun becomes a red giant (destroying Earth & the other planets).
 
I'm getting the same precision for double as for float on a Teensy 3.2.

Please enlighten me on my error.
Code:
const char *theversion="float_vs_double_test, Sep 30, 2016\n";
float thefloat;
double thedouble;

void setup() {
  Serial.begin(9600);
  while (!Serial) {
    ; // wait for serial port to connect. Needed for native USB port only
  }

  Serial.println(theversion);
  Serial.println("This micro:");
  Serial.print("sizeof(int) = ");Serial.println(sizeof(int));
  Serial.print("sizeof(int32_t) = ");Serial.println(sizeof(int32_t));
  Serial.print("sizeof(long) = ");Serial.println(sizeof(long));
  Serial.print("sizeof(float) = ");Serial.println(sizeof(float));
  Serial.print("sizeof(double) = ");Serial.println(sizeof(double));

  thefloat=PI;
  thedouble=PI;
  float thefpi=3.1415926535897932;
  double thedpi=3.1415926535897932;
  Serial.println("\ndigits\t\t1.234567 32 bit is good to 7 digits");
  Serial.println(      "\t\t3.14159265 is the first 9 exactly");
  Serial.print("PI float\t");Serial.print(thefloat,20);Serial.println(" is Arduino #define PI which is ~7 digits");
  Serial.print("my float\t");Serial.print(thefpi,20);Serial.println(" is my float");
  
  Serial.println("\ndigits\t\t1.234567890123456 64 bit is good to 16 digits");
  Serial.println(      "  \t\t3.1415926535897932 is the first 17 exactly");
  Serial.print("PI double\t");Serial.print(thedouble, 20);Serial.println(" is Arduino PI which is still only 7 digits?");
  Serial.print("my double\t");Serial.print(thedpi, 20);Serial.println(" is my double");
  Serial.printf("printf\t\t%-20.16f Arduino PI using printf", thedouble);
}

void loop() {
  // put your main code here, to run repeatedly:

}

Results
Code:
float_vs_double_test, Sep 30, 2016

This micro:
sizeof(int) = 4
sizeof(int32_t) = 4
sizeof(long) = 4
sizeof(float) = 4
sizeof(double) = 8

digits		1.234567 32 bit is good to 7 digits
		3.14159265 is the first 9 exactly
PI float	3.141592741012573 is Arduino #define PI which is ~7 digits
my float	3.141592741012573 is my float

digits		1.234567890123456 64 bit is good to 16 digits
  		3.1415926535897932 is the first 17 exactly
PI double	3.141592741012573 is Arduino PI which is still only 7 digits?
my double	3.141592741012573 is my double
printf		3.1415927410125732   Arduino PI using printf

Below is the Due output, note the higher precision for the same byte size double:

Code:
float_vs_double_test, Sep 30, 2016

This micro:
sizeof(int) = 4
sizeof(int32_t) = 4
sizeof(long) = 4
sizeof(float) = 4
sizeof(double) = 8

digits		1.234567 32 bit is good to 7 digits
		3.14159265 is the first 9 exactly
PI float	3.14159274101257324218 is Arduino #define PI which is ~7 digits
my float	3.14159274101257324218 is my float

digits		1.234567890123456 64 bit is good to 16 digits
  		3.1415926535897932 is the first 17 exactly
PI double	3.14159265358979311599 is Arduino PI which on Due is ~16 digits
my double	3.14159265358979311599 is my double
 
What you are seeing is there is an option Paul puts on the compiler that says all floating point constants are to be treated as single precision (-fsingle-precision-constant). The reason is the new 3.5/3.6 now have hardware single precision floating point, but double precision is still done in software emulation. If you are porting Arduino code where people inter-mix float and double, the double precision constants will force the whole expression to be done in double precision.

It's a trade-off Paul made to speed up the majority of programs, but unfortunately it can burn people that really want 8 byte floating point. If you code the constant as:
Code:
double thedpi=(double)3.1415926535897932L;

It should give you the full 8 byte constant. What it does is convert the constant as long double, and then convert it to double. Now, in the current ARM environment, long double is the same as double, but you don't want to depend on that. So you put in an explicit cast to get it back to double.

Ultimately the idea that constants are double comes from the first implementations of C on the PDP-11. IIRC, many of the 11's that had floating point in hardware or firmware only did double precision, and to do single precision, you had to change a mode to single precision, do the calculation, store it, and then change back to the normal mode, so it was much slower to do single precision.
 
Thanks for the quick answer. I've been pulling my hair out for days; I thought my Alzheimer's was in play.

Not sure I agree with what Paul has done on this one. As the example shows, my double is still taking up 8 bytes. The majority of programs should run slowly rather than incorrectly. This affects beginners trying to understand data structures too.

Oh well, good thing my 3.5 and 3.6 arrived today. So much for the last week wasted trying to get ready for them with incorrect speed tests all over the place.
 
What you are seeing is there is an option Paul puts on the compiler that says all floating point constants are to be treated as single precision (-fsingle-precision-constant). The reason is the new 3.5/3.6 now have hardware single precision floating point, but double precision is still done in software emulation.

Sorry for the rant. Just trying to understand. This brings up some questions:

1. Before the 3.5/3.6 were conceived, was there a version where double was handled as 64-bit?
2. On 3.2 hardware, are doubles stored as 4-byte floats in 8 bytes of RAM with the rest wasted?
3. So I can't use any standard ANSI C code from anywhere with doubles and expect it to work correctly on the Teensy?
4. Is there a new K&R C Programming Language for the Teensy? (sorry, I'm starting to get excited again)
5. Is there a single header file or something I can change to make this work as it should?
 
Sorry for the rant. Just trying to understand. This brings up some questions:

1. Before the 3.5/3.6 were conceived, was there a version where double was handled as 64-bit?
2. On 3.2 hardware, are doubles stored as 4-byte floats in 8 bytes of RAM with the rest wasted?
3. So I can't use any standard ANSI C code from anywhere with doubles and expect it to work correctly on the Teensy?
4. Is there a new K&R C Programming Language for the Teensy? (sorry, I'm starting to get excited again)
5. Is there a single header file or something I can change to make this work as it should?

No, double variables still have all 8 bytes of precision. It is only double constants that are converted to single precision. That is why I suggested adding a 'L' suffix to your constant, which makes it a long double, and then converting it back to double.

You can change the file hardware/teensy/avr/boards.txt to remove the -fsingle-precision-constant option, either in all of the instances, or make a new option (much like the various speed options) that does not have it. You will need to make this modification each time you install Teensyduino.

Here is the thread where Paul announced the change last year: https://forum.pjrc.com/threads/3111...constants?highlight=single+precision+constant. And notice my reply in #6 where I said it would cause problems in the few cases where people want all of the precision. However, ignore my reply #9, since #pragma GCC optimize only works for the C language and not for C++ (.ino/.pde files are converted to C++).

Perhaps when Paul gets back from the NY maker faire next week, you can convince him to make an option for this. Perhaps you will just need to make your own boards.txt and change it for every release. Or switch to using long double and adding 'L' to the constants.
 