Does Teensy 4.0 handle double data?
This page https://www.pjrc.com/store/teensy41.html
says: "Float point math unit, 64 & 32 bits"
Does this mean that it only handles float data?
64-bit is double.
Yes, thanks. I was wondering because I have code that, when I use double data and run it on Teensy, gives different results than on ESP32 for the same parameters and dataset.
The Teensy 4.x does double math in hardware (where possible), and there is no sign that it works incorrectly.
Can you post code, please? I'll run it on both, Teensy and ESP.
No, I can't, it's very big.
Here is a simple float/double sketch that I've run on ESP32 and Teensy 4. You can change the #if 0 to #if 1 to select float or double. Try it on your hardware.
Optimization was Faster on T4 and -O2 on ESP32. The error calculation looks the same for me. ESP32 has hardware float; T4 has hardware double and float.
Code:
// ll2utm
// double/float performance test
// from Tom.java (wrong) convert.php UTM.pm

#define REPS 1000

#if 0
#define FLTDBL float
#define MSG "float "
#define PI (3.141592653589793f)
#define SIN sinf
#define COS cosf
#define TAN tanf
#define SQRT sqrtf
#define POW powf
#else
#define FLTDBL double
#define MSG "double "
#define PI (3.141592653589793)
#define SIN sin
#define COS cos
#define TAN tan
#define SQRT sqrt
#define POW pow
#endif

FLTDBL lat = 35.0;  // expect 226202E 3877157N 17
FLTDBL lon = -84.0;
FLTDBL UTMEasting, UTMNorthing;
FLTDBL Lat, Long;
int zone;

void ll2utm(FLTDBL Lat, FLTDBL Long) {
  //converts lat/long to UTM coords. Equations from USGS Bulletin 1532
  //East Longitudes are positive, West longitudes are negative.
  //North latitudes are positive, South latitudes are negative
  //Lat and Long are in decimal degrees
  //Does not take into account the special UTM zones between 0 degrees and
  //36 degrees longitude above 72 degrees latitude and a special zone 32
  //between 56 degrees and 64 degrees north latitude
  //Written by Chuck Gantz- chuck.gantz@globalstar.com

  FLTDBL deg2rad = PI / 180;
  FLTDBL rad2deg = 180.0 / PI;
  // FLTDBL a = 6378206.4; // nad27
  // FLTDBL eccSquared = 0.006768658;
  FLTDBL a = 6378137; // wgs84/nad83
  FLTDBL eccSquared = 0.00669438;
  FLTDBL k0 = 0.9996;

  FLTDBL LongOrigin;
  FLTDBL eccPrimeSquared;
  FLTDBL N, T, C, A, M;

  FLTDBL LatRad = Lat * deg2rad;
  FLTDBL LongRad = Long * deg2rad;
  FLTDBL LongOriginRad;

  //compute the UTM Zone from the latitude and longitude
  zone = (int)((Long + 180) / 6) + 1;
  if ( lat >= 56.0 && lat < 64.0 && lon >= 3.0 && lon < 12.0 )
    zone = 32;
  // Special zones for Svalbard.
  if (lat >= 72.0 && lat < 84.0 ) {
    if ( lon >= 0.0 && lon < 9.0 ) zone = 31;
    else if ( lon >= 9.0 && lon < 21.0 ) zone = 33;
    else if ( lon >= 21.0 && lon < 33.0 ) zone = 35;
    else if ( lon >= 33.0 && lon < 42.0 ) zone = 37;
  }
  LongOrigin = ( zone - 1 ) * 6 - 180 + 3; // +3 puts origin in middle of zone
  LongOriginRad = LongOrigin * deg2rad;

  eccPrimeSquared = (eccSquared) / (1 - eccSquared);

  N = a / SQRT(1 - eccSquared * SIN(LatRad) * SIN(LatRad));
  T = TAN(LatRad) * TAN(LatRad);
  C = eccPrimeSquared * COS(LatRad) * COS(LatRad);
  A = COS(LatRad) * (LongRad - LongOriginRad);

  M = a * ((1 - eccSquared / 4 - 3 * eccSquared * eccSquared / 64
            - 5 * eccSquared * eccSquared * eccSquared / 256) * LatRad
           - (3 * eccSquared / 8 + 3 * eccSquared * eccSquared / 32
              + 45 * eccSquared * eccSquared * eccSquared / 1024) * SIN(2 * LatRad)
           + (15 * eccSquared * eccSquared / 256
              + 45 * eccSquared * eccSquared * eccSquared / 1024) * SIN(4 * LatRad)
           - (35 * eccSquared * eccSquared * eccSquared / 3072) * SIN(6 * LatRad));

  UTMEasting = (FLTDBL)(k0 * N * (A + (1 - T + C) * A * A * A / 6
                        + (5 - 18 * T + T * T + 72 * C - 58 * eccPrimeSquared) * A * A * A * A * A / 120)
                        + 500000.0);

  UTMNorthing = (FLTDBL)(k0 * (M + N * TAN(LatRad) * (A * A / 2
                         + (5 - T + 9 * C + 4 * C * C) * A * A * A * A / 24
                         + (61 - 58 * T + T * T + 600 * C - 330 * eccPrimeSquared) * A * A * A * A * A * A / 720)));
  if (Lat < 0)
    UTMNorthing += 10000000.0; //10000000 meter offset for southern hemisphere
}

void utm2ll() {
  // adapted from Chuck Gantz- chuck.gantz@globalstar.com
  FLTDBL deg2rad = PI / 180;
  FLTDBL rad2deg = 180.0 / PI;

  FLTDBL k0 = 0.9996;
  // FLTDBL a = 6378206.4; // nad27
  // FLTDBL eccSquared = 0.006768658;
  FLTDBL a = 6378137; // wgs84/nad83
  FLTDBL eccSquared = 0.00669438;
  FLTDBL eccPrimeSquared;
  FLTDBL e1 = (1 - SQRT(1 - eccSquared)) / (1 + SQRT(1 - eccSquared));
  FLTDBL N1, T1, C1, R1, D, M;
  FLTDBL LongOrigin;
  FLTDBL mu, phi1, phi1Rad;
  FLTDBL x, y;
  FLTDBL ZoneNumber;

  x = UTMEasting - 500000.0; //remove 500,000 m offset for longitude
  y = UTMNorthing;
  ZoneNumber = zone;

  LongOrigin = (ZoneNumber - 1) * 6 - 180 + 3; //+3 puts origin in middle of zone

  eccPrimeSquared = (eccSquared) / (1 - eccSquared);

  M = y / k0;
  mu = M / (a * (1 - eccSquared / 4 - 3 * eccSquared * eccSquared / 64
                 - 5 * eccSquared * eccSquared * eccSquared / 256));

  phi1Rad = mu + (3 * e1 / 2 - 27 * e1 * e1 * e1 / 32) * SIN(2 * mu)
            + (21 * e1 * e1 / 16 - 55 * e1 * e1 * e1 * e1 / 32) * SIN(4 * mu)
            + (151 * e1 * e1 * e1 / 96) * SIN(6 * mu);
  phi1 = phi1Rad * rad2deg;

  N1 = a / SQRT(1 - eccSquared * SIN(phi1Rad) * SIN(phi1Rad));
  T1 = TAN(phi1Rad) * TAN(phi1Rad);
  C1 = eccPrimeSquared * COS(phi1Rad) * COS(phi1Rad);
  R1 = a * (1 - eccSquared) / POW(1 - eccSquared * SIN(phi1Rad) * SIN(phi1Rad), 1.5);
  D = x / (N1 * k0);

  Lat = phi1Rad - (N1 * TAN(phi1Rad) / R1) * (D * D / 2
        - (5 + 3 * T1 + 10 * C1 - 4 * C1 * C1 - 9 * eccPrimeSquared) * D * D * D * D / 24
        + (61 + 90 * T1 + 298 * C1 + 45 * T1 * T1 - 252 * eccPrimeSquared - 3 * C1 * C1) * D * D * D * D * D * D / 720);
  Lat = Lat * rad2deg;

  Long = (D - (1 + 2 * T1 + C1) * D * D * D / 6
          + (5 - 2 * C1 + 28 * T1 - 3 * C1 * C1 + 8 * eccPrimeSquared + 24 * T1 * T1) * D * D * D * D * D / 120)
         / COS(phi1Rad);
  Long = LongOrigin + Long * rad2deg;
}

void setup() {
  Serial.begin(9600);
  while (!Serial);
}

void loop() {
  uint32_t t;

  t = micros();
  for (int i = 0; i < REPS; i++) {
    ll2utm(lat, lon);
    utm2ll();
  }
  t = micros() - t;
  // Serial.printf("%f %f %.0fE %.0fN %d %f %f %d us\n",
  //   lat, lon, ceil(UTMEasting), ceil(UTMNorthing), zone, Lat, Long, t);
  float err = sqrtf((lat - Lat) * (lat - Lat) + (lon - Long) * (lon - Long));
  Serial.print(MSG);
  Serial.print(REPS);
  Serial.print(" reps ");
  Serial.print(t);
  Serial.print(" us err ");
  Serial.println(err, 6);
  delay(5000);
}
You can uncomment the Serial.printf() to see the lat, lon and UTM results.
Code:
ESP32 -O2
float 1000 reps 30516 us err 0.000004
double 1000 reps 253389 us err 0.000000

T4 Faster
float 1000 reps 4238 us err 0.000004
double 1000 reps 7758 us err 0.000000
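To put the posted timings in perspective, the double/float ratio on each board can be worked out directly. This is only a small sketch in plain C++: the names slowdown and printSlowdowns are made up for illustration and are not part of the benchmark sketch, and the inputs are simply the microsecond figures quoted in the results.

```cpp
#include <cstdio>

// Hypothetical helper (not part of the benchmark sketch): how many times
// slower the double run was than the float run on the same board.
double slowdown(double doubleMicros, double floatMicros) {
    return doubleMicros / floatMicros;
}

// Prints the ratios for the timings posted in the results.
void printSlowdowns() {
    std::printf("ESP32 double/float: %.1fx\n", slowdown(253389.0, 30516.0));
    std::printf("T4    double/float: %.1fx\n", slowdown(7758.0, 4238.0));
}
```

That comes to a bit over 8x on the ESP32 and a bit under 2x on the T4, which is consistent with double being emulated in software on the ESP32 and done in hardware on the T4.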
... more performance comparisons or here
Last edited by manitou; 02-05-2022 at 07:37 PM.
@mena, looks like a bug in your program.
@Frank B, no, the code is the same. I'm just changing the microcontroller.
@manitou, what do you mean by "ESP32 has hardware float, T4 has hardware double and float"?
All Teensy LC, 3.x, and 4.x microprocessors have both 32-bit float and 64-bit double. The long double is also 64 bits. In terms of how they handle the floating point types:
- Teensy LC, 3.0, 3.1, and 3.2: both float and double are emulated in software. Floating constants without a suffix are treated as float;
- Teensy 3.5 and 3.6: float is done in hardware, double is emulated in software. Floating constants without a suffix are treated as float;
- Teensy 4.0, 4.1, and MicroMod: both float and double are done in hardware. Floating constants without a suffix are treated as double.
In the original AVR Arduino processors like the 328p that is used in the Arduino UNO and the 32u4 used in the Arduino Leonardo and the Teensy 2.0/2.0++, both float and double are 32 bits, and are emulated in software.
All ARM-based microprocessors should have both 32-bit float and 64-bit double types. Whether the types are emulated in software or done in hardware depends on which microprocessor is used. And whether floating point constants are considered to be 32-bit or 64-bit may depend on who set up the IDE compilation defaults.
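Since the treatment of unsuffixed constants can differ between builds, it may be worth checking it on each board before comparing results. The following is only a sketch in plain C++. The function names are invented for illustration, and it rests on an assumption: a build that narrows unsuffixed constants to float would make 0.1 and 0.1f hold the same value, while a build that keeps them as double would not.

```cpp
#include <cstdio>

// Returns true when an unsuffixed constant such as 0.1 keeps full double
// precision. Assumption: if the build treats unsuffixed constants as float,
// 0.1 and 0.1f hold the same value and this returns false.
bool unsuffixedConstantsAreDouble() {
    volatile double unsuffixed = 0.1;   // double unless the build narrows it
    volatile float  suffixed   = 0.1f;  // always float
    return unsuffixed != static_cast<double>(suffixed);
}

// Prints the storage size of each floating point type on this target.
void printFloatSizes() {
    std::printf("float %u, double %u, long double %u bytes\n",
                static_cast<unsigned>(sizeof(float)),
                static_cast<unsigned>(sizeof(double)),
                static_cast<unsigned>(sizeof(long double)));
}
```

On a Teensy or ESP32 the same check can be dropped into setup() with Serial.printf() in place of std::printf(). If the two boards disagree, constants written without an f suffix are one possible source of differing results.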
I’ve just found that testing two 64-bit integers for equality sometimes returns true when they are different! (On Teensy 4.1)
To fix the error I had to test the lower 32 bits ‘&&’ the higher 32 bits. This fixed it!
I wonder why?
Malcolm Messiter
Please provide a runnable sketch that demonstrates the problem.
What IDE and OS are you using? What optimizations (compiler settings) are you using?
Here’s the code:
Code:
void CompareModelsIDs(){
    // The saved MacAddress is compared with the one just received from the model ... etc ...
    uint8_t SavedModelNumber = ModelNumber;
    if (ModelMatched) return; // must not change when model connected
    GotoFrontView();
    RestoreBrightness();
    if (ModelIdentified) { // We have both bits of Model ID?
        if ((ModelsMacUnion.Val32[0] == ModelsMacUnionSaved.Val32[0]) && (ModelsMacUnion.Val32[1] == ModelsMacUnionSaved.Val32[1])) // heer
        {
            if (AnnounceConnected) {
                if (AutoModelSelect){
                    PlaySound(MMMATCHED);
                    DelayWithDog(1500);
                }
            }
            ModelMatched = true; // It's a match so start flying!
            return;
        }
        else {
            if (AutoModelSelect) { // It's not a match so search for it.
                ModelNumber = 0;
                while ((ModelMatched == false) && (ModelNumber < MAXMODELNUMBER - 1)) { // Try to match the ID with a saved one
                    ++ModelNumber;
                    ReadOneModel(ModelNumber);
                    if ((ModelsMacUnion.Val32[0] == ModelsMacUnionSaved.Val32[0]) && (ModelsMacUnion.Val32[1] == ModelsMacUnionSaved.Val32[1])){
                        ModelMatched = true;
                    }
                }
                if (ModelMatched) { // Found it!
                    UpdateModelsNameEveryWhere(); // Use it.
                    if (AnnounceConnected) {
                        PlaySound(MMFOUND);
                        DelayWithDog(1500);
                    }
                    SaveAllParameters(); // Save it
                    GotoFrontView();
                }else{
                    ModelNumber = SavedModelNumber; // Not found, so bind to the restored selected one
                    ReadOneModel(ModelNumber);
                    BindNow();
                    if (AutoModelSelect) {
                        PlaySound(MMSAVED);
                        DelayWithDog(1700);
                    }
                }
            }
            if (!AutoModelSelect) {
                BindNow();
            }
        }
    }
}
/****************
Last edited by defragster; 05-19-2023 at 09:07 PM. Reason: added CODE marking - "#" on the toolbar
The full code is over 12,000 lines.
All on GitHub “LockDownRadioControl”
Please try and develop a small test code that demonstrates the problem.
No Teensy 4 compare failures with this sketch with Arduino 1.8.19 and Teensyduino 1.58.
Every million random numbers, a "fail" is forced with b++.
Code:
static uint64_t cnt = 0;

void check(uint64_t x, uint64_t y) {
  if (x != y) {
    Serial.printf("ERROR %llx %llx cnt = %lld\n", x, y, cnt);
  }
}

void setup() {
  while (!Serial);
  uint64_t x = 0x1234567812345678;
  Serial.printf("x = %llx\n", x);
}

void loop() {
  uint64_t a, b;
  a = (uint64_t)(uint32_t)random() << 32 | (uint32_t)random();
  b = a;
  if (++cnt % 1000000 == 0) b++; // trigger fail
  check(a, b);
}
My similar tiny test also refused to fail!
Something as yet untraced is going on here.
I apologise if I wasted your time.
Incidentally, the unique MAC address of each Teensy 4.0 (in each model) is used to identify the model. Of course it’s not a full 64-bit number. I think it’s only 48 bits stored in 64.
The transmitter loads the needed parameters on identifying the model, so it’s a pity to get the wrong one!
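For an ID that is 48 bits stored in 64, comparing the whole 64-bit value and comparing the two 32-bit halves should always agree. A minimal sketch in plain C++ of both comparisons follows. All three function names are hypothetical, and it assumes the MAC sits in the low 48 bits, which this thread does not confirm.

```cpp
#include <cstdint>

// Keep only the low 48 bits, where a 6-byte MAC address would sit.
// Assumption: the ID occupies the low 48 bits of the 64-bit value.
uint64_t maskTo48(uint64_t id) {
    return id & 0x0000FFFFFFFFFFFFULL;
}

// Direct 64-bit comparison.
bool sameId64(uint64_t a, uint64_t b) {
    return a == b;
}

// The workaround described earlier: compare the low and high
// 32-bit halves separately and require both to match.
bool sameIdHalves(uint64_t a, uint64_t b) {
    const uint32_t aLo = static_cast<uint32_t>(a);
    const uint32_t aHi = static_cast<uint32_t>(a >> 32);
    const uint32_t bLo = static_cast<uint32_t>(b);
    const uint32_t bHi = static_cast<uint32_t>(b >> 32);
    return (aLo == bLo) && (aHi == bHi);
}
```

If sameId64() and sameIdHalves() ever disagree for the same inputs, that would point at the compiler. If they agree but the match is still wrong, the stored or received ID itself is the more likely suspect.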
Code in msg #19 is not a valid test. The compiler is able to notice variable b was assigned from variable a, so it knows they will always be equal.
Looking at the generated assembly, loop() has only 1 conditional test, at address 4de.
You can see that conditional test just skips past the rest of the code when cnt % 1000000 is not zero. The rest of the code doesn't do any conditional test. The check() function gets inlined, and the compiler discards the comparison of a and b. In fact, you can see variable b doesn't even exist in the compiled code.
Code:
000004b0 <loop>:
void loop() {
     4b0:  b5f0        push  {r4, r5, r6, r7, lr}
     4b2:  b085        sub   sp, #20
  uint64_t a, b;
  a = (uint64_t)(uint32_t)random() << 32 | (uint32_t)random();
     4b4:  f002 fe7e   bl    31b4 <random>
     4b8:  4607        mov   r7, r0
     4ba:  f002 fe7b   bl    31b4 <random>
  b = a;
  if (++cnt % 1000000 == 0) b++; // trigger fail
     4be:  4910        ldr   r1, [pc, #64]   ; (500 <loop+0x50>)
     4c0:  4a10        ldr   r2, [pc, #64]   ; (504 <loop+0x54>)
     4c2:  e9d1 5400   ldrd  r5, r4, [r1]
     4c6:  3501        adds  r5, #1
     4c8:  f144 0400   adc.w r4, r4, #0
     4cc:  e9c1 5400   strd  r5, r4, [r1]
  a = (uint64_t)(uint32_t)random() << 32 | (uint32_t)random();
     4d0:  4606        mov   r6, r0
  if (++cnt % 1000000 == 0) b++; // trigger fail
     4d2:  2300        movs  r3, #0
     4d4:  4628        mov   r0, r5
     4d6:  4621        mov   r1, r4
     4d8:  f001 fc8a   bl    1df0 <__aeabi_uldivmod>
     4dc:  4313        orrs  r3, r2
     4de:  d10c        bne.n 4fa <loop+0x4a>
     4e0:  1c73        adds  r3, r6, #1
    Serial.printf("ERROR %llx %llx cnt = %lld\n", x, y, cnt);
     4e2:  9300        str   r3, [sp, #0]
  if (++cnt % 1000000 == 0) b++; // trigger fail
     4e4:  f147 0300   adc.w r3, r7, #0
    Serial.printf("ERROR %llx %llx cnt = %lld\n", x, y, cnt);
     4e8:  9301        str   r3, [sp, #4]
     4ea:  4907        ldr   r1, [pc, #28]   ; (508 <loop+0x58>)
     4ec:  4807        ldr   r0, [pc, #28]   ; (50c <loop+0x5c>)
     4ee:  9502        str   r5, [sp, #8]
     4f0:  9403        str   r4, [sp, #12]
     4f2:  4632        mov   r2, r6
     4f4:  463b        mov   r3, r7
     4f6:  f000 ffdb   bl    14b0 <Print::printf(char const*, ...)>
  check(a, b);
     4fa:  b005        add   sp, #20
     4fc:  bdf0        pop   {r4, r5, r6, r7, pc}
     4fe:  bf00        nop
     500:  1fff0eb8    .word 0x1fff0eb8
     504:  000f4240    .word 0x000f4240
     508:  00008bdc    .word 0x00008bdc
     50c:  1fff0738    .word 0x1fff0738
The compiler optimizer is very good at following data dependency and removing redundant code!
I can take a quick look. But only quickly.
Please give me a link to your GitHub repository. If it has more than 1 branch, tell me exactly which code I need to check out.
Please also be specific about exactly which line number in which file has the conditional test you believe the compiler is not implementing properly.
I will try compiling your code and then find that place in the generated assembly. I can do this pretty quickly. But I'm not going to go hunting and searching just to find your code on GitHub and then try to match it up to the code sample you gave on this thread. I need you to be 100% clear about the exact place in the 12000 lines you believe is wrongly implemented.
It’s OK! I have traced the error. It was mine, but not in the code! It was operator error, and I was the operator! Because I had moved receivers around between models, the Teensy 4.0, complete with its unique MAC address, moved too. I’d forgotten I’d moved it, so the transmitter wrongly identified the model. All is now under control and I apologise for being an idiot!
"Code in msg #19 is not a valid test. The compiler is able to notice variable b was assigned from variable a, so it knows they will always be equal."
Dang. Though the question is moot, maybe using static volatile for a and b will avoid the compiler optimizations:
Code:
static uint64_t cnt = 0;
static volatile uint64_t a, b;

void check(uint64_t x, uint64_t y) {
  if (x != y) {
    Serial.printf("ERROR %llx %llx cnt = %lld\n", x, y, cnt);
  }
}

void setup() {
  while (!Serial);
  uint64_t x = 0x1234567812345678;
  Serial.printf("x = %llx\n", x);
}

void loop() {
  a = (uint64_t)(uint32_t)random() << 32 | (uint32_t)random();
  b = a;
  if (++cnt % 1000000 == 0) b++; // trigger fail
  check(a, b);
}
The test sketch also suffers from: "absence of proof is not proof of absence."
Code:
  check(a, b);
     112:  e9d7 2300   ldrd  r2, r3, [r7]
     116:  e9d6 6700   ldrd  r6, r7, [r6]
  if (x != y) {
     11a:  42bb        cmp   r3, r7
     11c:  bf08        it    eq
     11e:  42b2        cmpeq r2, r6
     120:  d007        beq.n 132 <loop+0x72>
    Serial.printf("ERROR %llx %llx cnt = %lld\n", x, y, cnt);
Last edited by manitou; 05-20-2023 at 10:58 AM.
Glad you found the problem.
Yes, that seems to work. Easy to see the full 64 bits really are compared, and nice to see the compiler makes use of the IT (If Then) instruction for efficient usage of the CPU's pipeline.
Just adding volatile on the local variables might also be enough.
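A version with volatile locals might look like the following. It is only a sketch in plain C++ under that suggestion: the function name mismatch and its parameters are made up for illustration, and the generated assembly has not been checked here, so whether the full 64-bit compare survives on a given compiler still needs to be confirmed in the listing.

```cpp
#include <cstdint>

// Hypothetical test helper: returns true when the 64-bit comparison reports
// a difference. Marking the locals volatile means every read of a and b is a
// real load, so the compiler cannot assume they are equal.
bool mismatch(uint64_t seed, bool forceFail) {
    volatile uint64_t a = seed;
    volatile uint64_t b = a;
    if (forceFail) {
        b = b + 1;  // deliberately make b differ from a
    }
    return a != b;
}
```

In a sketch, seed would come from random() as in the earlier test, and forceFail would be set every millionth iteration to confirm that the check can actually fire.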