defragster
Senior Member+
@mjs513 - the link failed: "Re: Benchmark STM32 vs ATMega328 (na...6 (teensy 3.2)" has no URL behind it.
@PaulStoffregen and others
While poking around STM32duino looking for stuff on CANFD, I came across a couple of benchmarking sketches: Dhrystone, Whetstone double precision, and Whetstone single precision. I copied them over and ran them on the T4B2; they compiled and ran with no issues. I'm just not sure what the results mean, since I have never seen these benchmarks used except on PCs. I am attaching them for reference, but I would like to get a reading (Paul) on the validity of using them with the Teensies.
View attachment 16552
I also found this post on the Arduino Forum: "Re: Benchmark STM32 vs ATMega328 (nano) vs SAM3X8E (due) vs MK20DX256 (teensy 3.2)". It has data for the T3.5 as well.
FWIW, would running the current code on the same Teensies give a feel for the adjustments needed? It might also be a cool indicator of the effects of the changes.
I noticed that in the Telemetry case, the Serial.print calls were boiled down to a single string print using snprintf(), which I saw make a huge difference on the T4.
I didn't find any old TelemView config files. They have a 'test' that would be a perfect base, and they give Arduino code for it, but I haven't figured out the graph drawing or gotten a saved config yet. I opened the YouTube video to watch the UPDATED TelemetryViewer_v0.5.jar version, but watched some WWII movies instead.
Dhrystone has been discussed several times on the Teensy forum, but CoreMark is probably a better replacement. @MichaelMeissner has chatted about Whetstone:
https://forum.pjrc.com/threads/34808-K66-Beta-Test?p=119119&viewfull=1#post119119
https://forum.pjrc.com/threads/55481-Generated-Code-of-teensy3-6?p=201032&viewfull=1#post201032
Thanks for porting Whetstone to a Teensy sketch. Somewhere I have the old FORTRAN code ...
I provided the Teensy 3* benchmark data to the Arduino forum thread you mention. I have the T4 data, but due to non-disclosure I did not provide that info to that Arduino thread.
I have some T4 floating point performance data at https://forum.pjrc.com/threads/54711-Teensy-4-0-First-Beta-Test?p=194187&viewfull=1#post194187
I added whetstone numbers.
Due (?)
C Converted Single Precision Whetstones: 7.80 MIPS
C Converted Double Precision Whetstones: 5.40 MIPS
STM32H743 (480 MHz?)
C Converted Single Precision Whetstones: 471.70 MIPS
C Converted Double Precision Whetstones: 67.16 MIPS
0,11,189,0.1564344615
0,11,190,0.17364
void setup() {
  Serial.begin(9600);
}

uint32_t count = 0;

void loop() {
  count++;
  // Three test waveforms taken from different bit fields of the counter
  uint32_t waveform_a = ((count & 0x3ff00) >> 8);
  uint32_t waveform_b = ((count & 0x3ffF0) >> 4);
  uint32_t waveform_c = ((count & 0x3ff));
  float sine_wave_1khz = sin(radians(count % 90));
  // Convert the float to text first, then build the whole line in one buffer
  char sine_wave_1khz_text[30];
  dtostrf(sine_wave_1khz, 10, 10, sine_wave_1khz_text);
  char text[124];
  snprintf(text, 124, "%d,%d,%d,%s", (int)waveform_a, (int)waveform_b, (int)waveform_c, sine_wave_1khz_text);
  Serial.println(text);  // a single print call per line
  delayMicroseconds(10);
}
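To check the line format off-board, here is a minimal host-side sketch of the same line building in plain C++. The helper name format_line is hypothetical, and it swaps dtostrf() for a "%.10f" conversion since a desktop snprintf handles floats directly, so the trailing digits of the sine column may differ from what a Teensy prints.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical host-side stand-in for the sketch's loop body:
// builds one CSV line from a counter value.
std::string format_line(uint32_t count) {
    uint32_t waveform_a = (count & 0x3ff00) >> 8;  // counter bits 8..17
    uint32_t waveform_b = (count & 0x3fff0) >> 4;  // counter bits 4..17
    uint32_t waveform_c = (count & 0x3ff);         // counter bits 0..9

    // Equivalent of sin(radians(count % 90)) without the Arduino macro.
    const float kPi = 3.14159265358979f;
    float sine = std::sin((count % 90) * kPi / 180.0f);

    // One buffer, one formatted line, ready for a single print call.
    char text[124];
    std::snprintf(text, sizeof(text), "%u,%u,%u,%.10f",
                  (unsigned)waveform_a, (unsigned)waveform_b,
                  (unsigned)waveform_c, (double)sine);
    return std::string(text);
}
```

For count = 189 the three integer fields come out as 0, 11 and 189, which matches the sample line "0,11,189,0.1564344615" above.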
uint32_t prior_count;
uint32_t prior_msec;
uint32_t count_per_second;
uint32_t count = 0;
uint32_t msec;

void setup() {
  prior_count = count;
  count_per_second = 0;
  prior_msec = millis();
  msec = micros() / 1000;
  Serial.begin(9600);
  Serial2.begin(2000000);  // second port reports the lines-per-second figure
  Serial2.println(count_per_second);
  delay(1000);
}

void loop() {
  count++;
  uint32_t waveform_a = ((count & 0x3ff00) >> 8);
  uint32_t waveform_b = ((count & 0x3ffF0) >> 4);
  uint32_t waveform_c = ((count & 0x3ff));
  float sine_wave_1khz = sin(radians(count % 90));
  char sine_wave_1khz_text[30];
  dtostrf(sine_wave_1khz, 10, 10, sine_wave_1khz_text);
  char text[124];
  // %u expects unsigned arguments, so cast to unsigned rather than int
  snprintf(text, sizeof(text), "%u,%u,%u,%s", (unsigned)waveform_a, (unsigned)waveform_b, (unsigned)waveform_c, sine_wave_1khz_text);
  Serial.println(text);
  delayMicroseconds(10);
  msec = millis();
  // Once per second, report how many lines were sent since the last report
  if (msec - prior_msec > 1000) {
    prior_msec = prior_msec + 1000;
    count_per_second = count - prior_count;
    prior_count = count;
    Serial2.print(count_per_second);
    Serial2.println('.');
  }
}
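The once-per-second bookkeeping in that loop can be pulled out and tried on a PC with a fake clock. This is only a sketch: the RateCounter struct and its update() method are hypothetical names, not part of the posted code, but the logic inside is the same as the if block above.

```cpp
#include <cstdint>

// Hypothetical helper mirroring the sketch's lines-per-second bookkeeping.
struct RateCounter {
    uint32_t prior_count = 0;
    uint32_t prior_msec = 0;
    uint32_t count_per_second = 0;

    // Call with the running line count and the current millisecond clock.
    // Returns true when a new per-second figure was computed.
    bool update(uint32_t count, uint32_t msec) {
        // Unsigned subtraction keeps working across a millis() rollover.
        if (msec - prior_msec > 1000) {
            prior_msec += 1000;  // fixed step, so late checks do not accumulate drift
            count_per_second = count - prior_count;
            prior_count = count;
            return true;
        }
        return false;
    }
};
```

Stepping prior_msec by a fixed 1000 instead of copying msec means a check that arrives a little late does not push every later report back.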
Something very fragile in the balance
Test Package:
lps_test.exe : for Windows - runs to receive 100 blocks of 32KB
>> Param1 :: serial port like 'Com25'
>> Param2 :: 0 for 100000000 block loops
>> Param2 :: 1 for 1000000 block loops
>> Param2 :: 2-9 for # of 100 block loops
OUTPUT :: reads COM# block of data and shows one string per block
lps_test.c : source for precompiled EXE - could take mods for non-Windows
bld2.bat : cmdline compile for gcc
EchoBoth : Sketch for second Teensy.
>> Echoes Serial1 input at 115200 baud
>> Read FreqCount from pin13
TelemViewFast : sketch for T4 to demo Lines Per Second output
>> Serial2 connects at 115200 baud
>> pinMode(14, OUTPUT); // Toggle w/each output string for FreqCount
>> uint32_t usDelay = 10; // start us delay between transmit
>> //#define DELAY_DROP 1 // comment to keep a static delay
Usage: Wire Serial as noted above, plus the FreqCount pin
Put EchoBoth on a Teensy
Put TelemViewFast on the target Teensy
From the Windows cmdline, run for the device port:: lps_test COM25
OPTIONAL: when it is working, the output is formatted for TelemetryViewer graphing. Download the JAR from 'http://www.farrellf.com/TelemetryViewer/' and get it running with a proper Java runtime. The 'Layout' is in the file "RealTest20.txt"
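For anyone adapting the receiver side to a non-Windows host, the "one string per block" idea can be sketched roughly as below. This is NOT the code from lps_test.c; the function name last_complete_line and the approach of taking the text between the last two newlines are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: return the last newline-terminated line in a block
// of received serial data, or an empty string if the block has no newline.
// If the block holds only one newline, the result may be the tail of a
// line that started in the previous block.
std::string last_complete_line(const std::string& block) {
    std::size_t end = block.rfind('\n');
    if (end == std::string::npos) return "";

    // Look for the newline that ends the previous line, if there is one.
    std::size_t prev = (end == 0) ? std::string::npos : block.rfind('\n', end - 1);
    std::size_t start = (prev == std::string::npos) ? 0 : prev + 1;

    std::string line = block.substr(start, end - start);
    if (!line.empty() && line.back() == '\r') line.pop_back();  // drop CR from CRLF
    return line;
}
```

A trailing partial line (data after the last newline) is ignored, so a block boundary landing mid-line does not produce a truncated string.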
fCnt=89799
fCnt=88009
fCnt=87766
fCnt=33997
fCnt=0
fCnt=78273
fCnt=89179
#13944[32K] : __>> 1,16360,649,81 <<__
#13945[32K] : __>> 1,129,16,89 <<__
fCnt=87889
fCnt=27343
fCnt=0
fCnt=87459
fCnt=87814
fCnt=68860
fCnt=0
fCnt=17163
fCnt=24399
fCnt=26972
fCnt=8832
fCnt=0
fCnt=19187
fCnt=17665
fCnt=25228
void setup() {
  Serial.begin(115200);
  Serial.flush();
}

void loop() {
  if (Serial.available()) {
    byte c = Serial.read();
    if (c == 'x') {  // 'x' is end of input message
      // Reply with a short fixed message, also terminated by 'x'
      Serial.write('0');
      Serial.write('1');
      Serial.write('2');
      Serial.write('x');
      Serial.send_now();  // comment out for non-Teensy boards
    }
  }
}
T:\T_Downloads\pjrc_latency_test>latency_test COM25
port COM25 opened
waiting for board to be ready:
.ok
latency @ 1 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 2 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 12 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.61 maximum
latency @ 16 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 30 bytes: 0.22 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 31 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 63 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.63 maximum
latency @ 64 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 65 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 71 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 126 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 127 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 128 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 129 bytes: 0.22 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 500 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 512 bytes: 0.31 ms average, 2 max hits, 15.62 2nd max, 15.62 maximum
latency @ 640 bytes: 0.31 ms average, 2 max hits, 15.62 2nd max, 15.67 maximum
latency @ 1000 bytes: 0.38 ms average, 2 max hits, 15.59 2nd max, 15.63 maximum
latency @ 1278 bytes: 0.47 ms average, 2 max hits, 15.65 2nd max, 15.65 maximum
latency @ 1279 bytes: 0.53 ms average, 3 max hits, 15.62 2nd max, 15.67 maximum
latency @ 1280 bytes: 0.62 ms average, 3 max hits, 15.63 2nd max, 15.70 maximum
latency @ 1281 bytes: 0.38 ms average, 2 max hits, 15.65 2nd max, 15.65 maximum
latency @ 2000 bytes: 0.69 ms average, 2 max hits, 15.65 2nd max, 15.74 maximum
latency @ 2047 bytes: 0.78 ms average, 1 max hits, 0.00 2nd max, 15.81 maximum
latency @ 2048 bytes: 0.84 ms average, 1 max hits, 0.00 2nd max, 15.76 maximum
latency @ 2049 bytes: 0.69 ms average, 3 max hits, 15.72 2nd max, 15.77 maximum
latency @ 4000 bytes: 1.31 ms average, 2 max hits, 15.82 2nd max, 15.83 maximum
latency @ 4095 bytes: 1.22 ms average, 3 max hits, 15.43 2nd max, 15.80 maximum
latency @ 4096 bytes: 1.47 ms average, 2 max hits, 15.81 2nd max, 15.81 maximum
latency @ 4097 bytes: 1.31 ms average, 2 max hits, 15.71 2nd max, 15.79 maximum
latency @ 8000 bytes: 2.39 ms average, 2 max hits, 6.65 2nd max, 15.85 maximum
UP ----- pass #1 elapsed time 1.588 secs for 4106700 bytes
latency @ 8000 bytes: 2.31 ms average, 3 max hits, 15.80 2nd max, 15.84 maximum
latency @ 4097 bytes: 1.22 ms average, 3 max hits, 15.63 2nd max, 15.77 maximum
latency @ 4096 bytes: 1.32 ms average, 4 max hits, 15.65 2nd max, 15.71 maximum
latency @ 4095 bytes: 1.32 ms average, 3 max hits, 15.70 2nd max, 15.72 maximum
latency @ 4000 bytes: 1.22 ms average, 4 max hits, 15.68 2nd max, 15.75 maximum
latency @ 2049 bytes: 0.78 ms average, 2 max hits, 15.63 2nd max, 15.78 maximum
latency @ 2048 bytes: 0.85 ms average, 3 max hits, 15.63 2nd max, 15.63 maximum
latency @ 2047 bytes: 0.69 ms average, 3 max hits, 15.63 2nd max, 15.78 maximum
latency @ 2000 bytes: 0.69 ms average, 2 max hits, 15.46 2nd max, 15.66 maximum
latency @ 1281 bytes: 0.63 ms average, 4 max hits, 15.68 2nd max, 15.68 maximum
latency @ 1280 bytes: 0.53 ms average, 2 max hits, 15.48 2nd max, 15.70 maximum
latency @ 1279 bytes: 0.47 ms average, 2 max hits, 15.58 2nd max, 15.70 maximum
latency @ 1278 bytes: 0.53 ms average, 2 max hits, 15.62 2nd max, 15.63 maximum
latency @ 1000 bytes: 0.47 ms average, 3 max hits, 15.62 2nd max, 15.64 maximum
latency @ 640 bytes: 0.31 ms average, 2 max hits, 15.60 2nd max, 15.71 maximum
latency @ 512 bytes: 0.22 ms average, 2 max hits, 6.50 2nd max, 15.62 maximum
latency @ 500 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.63 maximum
latency @ 129 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 128 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 127 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 126 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 71 bytes: 0.07 ms average, 1 max hits, 0.00 2nd max, 6.50 maximum
latency @ 65 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.63 maximum
latency @ 64 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 63 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 31 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 30 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 16 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 12 bytes: 0.16 ms average, 1 max hits, 0.00 2nd max, 15.62 maximum
latency @ 2 bytes: 0.00 ms average, 0 max hits, 100.00 2nd max, 0.00 maximum
latency @ 1 bytes: 0.07 ms average, 1 max hits, 0.00 2nd max, 6.51 maximum
DOWN --- pass #1 elapsed time 1.572 secs for 4106700 bytes
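The byte total in the pass summary can be cross-checked against the sizes in the log. The 31 message sizes sum to 41067, and 4106700 is exactly 100 times that, which suggests 100 trials per size (an inference from the numbers, not taken from the latency_test source). A small sketch of that arithmetic, with hypothetical helper names:

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Message sizes as listed in the latency log above.
const std::vector<uint32_t> kSizes = {
    1, 2, 12, 16, 30, 31, 63, 64, 65, 71, 126, 127, 128, 129,
    500, 512, 640, 1000, 1278, 1279, 1280, 1281,
    2000, 2047, 2048, 2049, 4000, 4095, 4096, 4097, 8000};

// Total bytes for one pass, given an assumed number of trials per size.
uint64_t total_bytes(uint32_t trials_per_size) {
    uint64_t sum = std::accumulate(kSizes.begin(), kSizes.end(), uint64_t{0});
    return sum * trials_per_size;
}

// Average rate over a pass, in bytes per second.
double bytes_per_second(uint64_t bytes, double seconds) {
    return static_cast<double>(bytes) / seconds;
}
```

With 100 trials per size this gives 4106700 bytes, and dividing by the 1.588 s of the UP pass works out to roughly 2.59 million bytes per second. It would also explain the 0.16 ms averages: a single hit of about 15.6 ms spread over 100 trials.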
count=86447975, lines/sec=451763
count=86447976, lines/sec=451763
count=86447977, lines/sec=451763
count=86447978, lines/sec=451763
count=86447979, lines/sec=451763
count=86447980, lines/sec=451763
When I comment out the first line in printf.h to disable debug printf, T4B2 sketches won't upload unless I press the program button. If I uncomment that line, uploads and runs work again.
Now all you need to do is to rewrite the Java code
Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: Java heap space
Are you seeing the Java heap space run out on Windows, or any other system?