Thread: double literals need to be marked as long double?

  1. #1
    Member
    Join Date
    Mar 2017
    Location
    Oakland, CA, USA
    Posts
    35

    double literals need to be marked as long double?

    Why does the following code output 3ff19999a0000000 instead of 3ff199999999999a?
    Code:
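    void setup() {
      double d = 1.1;  // unsuffixed literal under test
      uint64_t u;
      memcpy(&u, &d, sizeof(u));  // copy the raw bits of the double
      // Print the high and low 32-bit halves as 16 hex digits.
      Serial.printf("%08x%08x\n",
          static_cast<uint32_t>(u >> 32), static_cast<uint32_t>(u));
    }

    void loop() {}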
    When I change the literal to 1.1L, a long double, it prints the desired result. I thought that the double type on the Teensy was 64 bits?

  2. #2
    Senior Member+ MichaelMeissner's Avatar
    Join Date
    Nov 2012
    Location
    Ayer, Massachusetts
    Posts
    2,718
    The Teensy 3.5 and 3.6 only support single precision floating point in hardware. So as a compromise, Paul turns on the option -fsingle-precision-constant, which forces constants without an explicit suffix to be single precision. This way, the expression:

    Code:
    float f;
    
    // ...
    
    f += 1.0;
    is done in single precision. Without the option, f would be converted to double, and then the __adddf3 function would be called to emulate double precision, and then the value would be converted back to single precision.
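    The bit pattern in the first post is what you get when 1.1 is rounded to float and then widened to double. Below is a minimal host-side sketch of that, assuming a desktop C++11 compiler with default flags and IEEE 754 doubles. The 'f' suffix stands in for what the option does to an unsuffixed literal, and bits_of and print_both are names made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Return the raw IEEE 754 bit pattern of a 64-bit double.
static uint64_t bits_of(double d) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");
  uint64_t u;
  std::memcpy(&u, &d, sizeof(u));
  return u;
}

// Print 1.1 rounded to float and then widened, followed by 1.1 as a double.
static void print_both() {
  const double widened = 1.1f;  // rounded to 24 significant bits first
  const double full = 1.1;      // 53 significant bits under default flags
  assert(bits_of(widened) != bits_of(full));
  std::printf("%016llx\n", static_cast<unsigned long long>(bits_of(widened)));
  std::printf("%016llx\n", static_cast<unsigned long long>(bits_of(full)));
}
```

    Called from main() on such a host, print_both() prints 3ff19999a0000000 and then 3ff199999999999a, the two values from the first post.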

    C's rules about constants being double hark back to the initial machines Dennis Ritchie developed C on (PDP-7 and then mostly PDP-11). On the models that they had in Bell Labs, double precision was the default, and you had to switch the CPU to single precision if you wanted to do 32-bit calculations. In the same vein, on x86, until Intel came out with the SSE instruction set, all floating point calculations were done in the x87 unit's 80-bit format. So the default arithmetic is double, and constants defaulted to that as well. It wasn't until the first ANSI (later ISO) C standard that single precision arithmetic did not automatically promote to double.

    But on machines that support both float and double, where double is much slower than float, that default can be a problem unless people use the 'F' suffix.
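    The promotion rule can be checked at compile time. This is a small sketch assuming a C++11 compiler with default flags (no -fsingle-precision-constant); add_one is a hypothetical helper, not something from the thread.

```cpp
#include <type_traits>
#include <utility>

// A float combined with an unsuffixed literal is evaluated as double...
static_assert(
    std::is_same<decltype(std::declval<float>() + 1.0), double>::value,
    "float + 1.0 promotes to double");

// ...while the 'f' suffix keeps the whole expression in single precision.
static_assert(
    std::is_same<decltype(std::declval<float>() + 1.0f), float>::value,
    "float + 1.0f stays float");

// Hypothetical helper: every step here is single precision arithmetic.
constexpr float add_one(float f) {
  return f + 1.0f;
}
```

    On a part with a single precision FPU only, the second form is the one that avoids the software double routines.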

    While C's rules are somewhat arbitrary, they are at least better than PL/1's, in which constants with 6 or fewer digits were converted to single precision and 7 or more were converted to double precision. When I wrote the Data General C compiler in PL/1 some 30 years ago, I remember having to specify things as 0.100000000 to get the proper constant.

    Fortunately, the ARM 32-bit compiler does not support IEEE 128-bit floating point, and long double is the same as double. So when you declare a constant with the 'L' suffix, it doesn't change it to a different representation. I've been working on and off with IEEE 128-bit floating point for the last few years (I work for IBM on the PowerPC GCC compiler, and the soon to be shipping POWER9 has IEEE 128-bit floating point hardware).
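    Whether long double is wider than double depends on the target, so it can be worth checking rather than assuming. A minimal sketch using std::numeric_limits follows; the names kDoubleDigits, kLongDoubleDigits, long_double_is_double and report_widths are made up for this example.

```cpp
#include <cstdio>
#include <limits>

// Significand width in bits, counting the implicit leading bit.
constexpr int kDoubleDigits = std::numeric_limits<double>::digits;
constexpr int kLongDoubleDigits = std::numeric_limits<long double>::digits;

// True when long double carries no more precision than double here.
constexpr bool long_double_is_double() {
  return kLongDoubleDigits == kDoubleDigits;
}

// Print both widths so the result can be compared across targets.
static void report_widths() {
  std::printf("double: %d bits, long double: %d bits (%s)\n",
              kDoubleDigits, kLongDoubleDigits,
              long_double_is_double() ? "same precision" : "wider");
}
```

    Going by the post above, a 32-bit ARM build should report the same width for both, while an x86-64 Linux host typically reports a wider long double.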
    Last edited by MichaelMeissner; 11-10-2017 at 06:08 AM.

  3. #3
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    17,624
    Yeah, I wish there was a better solution, but this seems to be the best trade-off.

    The Arduino world is filled with code written on AVR, where double is implemented as 32 bit float. Almost nobody types "1.0f", they just put "1.0" into their code because they know all float math is 32 bits.

    When run on a 32 bit ARM chip, especially one with an FPU that makes 32 bit float about the same speed as integers, the penalty for automatically promoting almost all expressions to slow 64 bit double done in software is just too much.

    Also on my to-do list (but a low priority) is someday switching some of the common C library math functions like sin(), log() to C++ templates so we can automatically do fast 32 bit float when we know the input and output are only 32 bits.

  4. #4
    Senior Member+ Theremingenieur's Avatar
    Join Date
    Feb 2014
    Location
    Colmar, France
    Posts
    1,626
    Quote Originally Posted by PaulStoffregen View Post
    Also on my to-do list (but a low priority) is someday switching some of the common C library math functions like sin(), log() to C++ templates so we can automatically do fast 32 bit float when we know the input and output are only 32 bits.
    Isn't it sufficient to simply use sinf(), logf(), and so on in the code?
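    For comparison, here is a rough sketch of letting the argument type pick the routine, so that callers do not have to remember the 'f' names. It uses plain overloads rather than templates, and fast_sin is a hypothetical name, not part of Teensyduino or the C library. Standard C++ already does the same thing for std::sin in <cmath>.

```cpp
#include <cmath>
#include <math.h>
#include <type_traits>

// Hypothetical wrappers: overload resolution selects the single precision
// routine whenever the argument is a float.
inline float fast_sin(float x) { return sinf(x); }
inline double fast_sin(double x) { return sin(x); }

// The result type follows the argument type.
static_assert(std::is_same<decltype(fast_sin(1.0f)), float>::value,
              "float in, float out");
static_assert(std::is_same<decltype(fast_sin(1.0)), double>::value,
              "double in, double out");
```

    Existing code that calls the wrapper with a float variable then gets the 32 bit routine without any change at the call site.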

  5. #5
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    17,624
    Yes. But what to do about the huge amount of existing, poorly designed code in hundreds of Arduino sketches and libraries?

  6. #6
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    4,438
    What are your plans with Teensy 4 in this regard (literals)? I think the chip has a double precision FPU, correct?

  7. #7
    Member
    Join Date
    Mar 2017
    Location
    Oakland, CA, USA
    Posts
    35
    So to confirm, there's no way to have a 64-bit double literal when compiling for the Teensy, other than adding the 'L' suffix and turning them into long doubles? This approach gives some libraries a hard time, for example ArduinoUnit, where there's no assert overload taking a long double.

  8. #8
    Member
    Join Date
    Mar 2017
    Location
    Oakland, CA, USA
    Posts
    35
    In PlatformIO, I just added the -fno-single-precision-constant build option. I'm going to hope that all the Teensyduino code is reasonably well written to use float literals with the 'f' suffix where necessary.
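    One way to confirm that a build really treats unsuffixed literals as double is a compile-time guard in a test file. This is a sketch under the assumption that the single precision option rounds the value of an unsuffixed constant to float, as the output in the first post suggests; kUnsuffixedIsDouble is a made-up name and this has not been checked against a Teensy toolchain.

```cpp
// With double literals, 1.1 differs from 1.1f once the float is promoted.
// If unsuffixed constants are rounded to single precision, the two sides
// hold the same value and the guard below stops the build.
constexpr bool kUnsuffixedIsDouble = (1.1 != 1.1f);

static_assert(kUnsuffixedIsDouble,
              "unsuffixed floating point literals are rounded to float");
```

    With default flags on a desktop compiler the assertion passes silently.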

  9. #9
    Senior Member+ Theremingenieur's Avatar
    Join Date
    Feb 2014
    Location
    Colmar, France
    Posts
    1,626
    Quote Originally Posted by shawn View Post
    In PlatformIO, I just added the -fno-single-precision-constant build option. I'm going to hope that all the Teensyduino code is reasonably well written to use float literals with the 'f' suffix where necessary.
    If not, then you are cordially invited to fix everything and to do pull requests on GitHub.

  10. #10
    Member
    Join Date
    Mar 2017
    Location
    Oakland, CA, USA
    Posts
    35
    I'm writing a CBOR (Concise Binary Object Representation, RFC 7049) library and I encountered this when writing tests.
    Last edited by shawn; 11-11-2017 at 03:48 AM.

  11. #11
    Senior Member+ MichaelMeissner's Avatar
    Join Date
    Nov 2012
    Location
    Ayer, Massachusetts
    Posts
    2,718
    Note, you can always put in a cast:

    Code:
    #define POINT_ONE ((double) 0.1L)
    Or use const:

    Code:
    const double POINT_ONE = 0.1L;
    But, I assume you know that.
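    A short host-side sketch of the cast approach, assuming IEEE 754 doubles. bits_of, kOnePointOne and same_double are names invented for this example, and same_double stands in for a library call that only has a double overload.

```cpp
#include <cstdint>
#include <cstring>

// Return the raw bit pattern of a 64-bit double.
static uint64_t bits_of(double d) {
  static_assert(sizeof(double) == sizeof(uint64_t), "expects a 64-bit double");
  uint64_t u;
  std::memcpy(&u, &d, sizeof(u));
  return u;
}

// The 'L' literal keeps its precision (per the first post), and the cast
// narrows it to double for APIs that have no long double overload.
static const double kOnePointOne = static_cast<double>(1.1L);

// Hypothetical stand-in for a call that accepts double arguments only.
static bool same_double(double expected, double actual) {
  return bits_of(expected) == bits_of(actual);
}
```

    Passing kOnePointOne instead of a bare 1.1L sidesteps the missing long double overload mentioned earlier in the thread.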

  12. #12
    Member
    Join Date
    Mar 2017
    Location
    Oakland, CA, USA
    Posts
    35
    I hadn't thought of the casting-a-long-double-to-a-double approach. Thanks for the tip.
