Teensy 3.6 256MHz

Frank B

If you want to try it (please!), download the core and overwrite your local files.
Some libraries need updates, but the needed changes to the core are done.
Default F_BUS is 64MHz (128MHz optional, not tested), flash 32MHz.

It works for me (so far) with all my Teensy 3.6 boards. I could not test much; that's a "TODO" :)

A new boards.txt for 256MHz overclock is attached to this post.

Edit: Paul Stoffregen merged my pull request; it will be in Teensyduino 1.45 (or download it here: https://github.com/PaulStoffregen/cores)
 

Attachments

  • boards.txt
Open Serial _FE/GPS test sketch worked with F_BUS == 60, but with F_BUS == 128 it hits this:
T:\arduino-1.8.7_144_TYC\hardware\teensy\avr\cores\teensy3\analog.c:130:2: error: #error "F_BUS must be 120, 108, 96, 90, 80, 72, 64, 60, 56, 54, 48, 40, 36, 24, 4 or 2 MHz"

 
Oh.. some details are missing :) I'll fix that tomorrow. It's too late to think in these lands. (And perhaps too much Coke, Bacardi and ice)
 
Seems to actually be running at the advanced speed! Seeing 255,997,016 cpu cycles per second. No issues on multiple compiles and uploads.

I found a sketch that adjusts the RTC, and since I have the GPS with PPS on I got that running.

The GPS PPS signal interrupts once per second.

At 180 MHz, the cycle count difference between PPS arrivals shows as:
New PPS:: 179997989
New PPS:: 179997976

Bumping to 256 MHz and it is showing:
New PPS:: 255997016
New PPS:: 255997029

This code uses the RTC interrupt plus one on the PPS pin and, in theory, adjusts the RTC timing to make the two match. So that hardware is all working, and the cycle counts show the matching is working.

And at the older slower 240 MHz:
New PPS:: 239997281
New PPS:: 239997267

It seems this code doesn't actually tune the clock to run as fast as it should, though. This is a different T_3.6, and it is missing about 2700 clocks per second, which is what I saw on my other GPS system when testing, so it must have adjusted that one too?
 
OK, I've added the fixes for F_BUS 128MHz.

Hm, I think we have to do more tests, tomorrow.
Thank you, Tim.
 
Thanks Frank, I'll wait for a bit to update to try the f_bus==128 to see it run at 64

The above linked sketch claims to be doing tuning - but it seems to be missing a key element in the searches as I've looked so far. The linked code is for T_3.2? Same registers on T_3.6?

<edit>: had another look - that indeed is current code - will need to see how to use it
 
Not that this was the important part of your OP Frank, but to follow up - all indications are that it is working at 256 MHz!
Putting this:
Code:
void setup() {
  OSC0_CR = 0x02;  // low 4 bits select the crystal load capacitance
  // ...
}

in the code and checking against my Adafruit GPS PPS shows OSC0_CR = 0x02 to be the best across these F_CPU values:
OSC0_CR = 0x02
F_CPU = 256000000
New PPS:: 255999580

And 180:
OSC0_CR = 0x02
F_CPU = 180000000
New PPS:: 179999699

And 240:
OSC0_CR = 0x02
F_CPU = 240000000
New PPS:: 239999680

This must be set in setup() on every start to take effect, and it gives 7+ times fewer missed clock ticks than no adjustment.

I got the value of 2 from the K66 Beta, and it is the closest single value. I just hacked this into the code I was using. [Reading the post linked in p#7, I noticed then that it didn't actually do anything.]

I made a fuller list of the cycle counts per PPS across bits 0-3 of OSC0_CR. As TelephoneBill found on his Beta K66, 0x02 is a bit low, and 0x0C is the least high one to dither with:
Code:
F_CPU = 256000000
New PPS:: 255997111
New PPS:: 255997128   OSC0_CR = NONE

New PPS:: 256018700
New PPS:: 256018700   OSC0_CR = 0

New PPS:: 255991384
New PPS:: 255991380   OSC0_CR = 1

New PPS:: 255999640
New PPS:: 255999632   OSC0_CR = 2

New PPS:: 255986652
New PPS:: 255986648   OSC0_CR = 3

New PPS:: 256006796
New PPS:: 256006800   OSC0_CR = 4

New PPS:: 255988772
New PPS:: 255988772   OSC0_CR = 5

New PPS:: 255994888
New PPS:: 255994900   OSC0_CR = 6

New PPS:: 255984956
New PPS:: 255984944   OSC0_CR = 7

New PPS:: 256011956
New PPS:: 256011956   OSC0_CR = 8

New PPS:: 255990004
New PPS:: 255990000   OSC0_CR = 9

New PPS:: 255997088
New PPS:: 255997092   OSC0_CR = 10

New PPS:: 255985764
New PPS:: 255985764   OSC0_CR = 11

New PPS:: 256002908
New PPS:: 256002908   OSC0_CR = 12

New PPS:: 255987684
New PPS:: 255987684   OSC0_CR = 13

New PPS:: 255993080
New PPS:: 255993084   OSC0_CR = 14

New PPS:: 255984224
New PPS:: 255984220   OSC0_CR = 15
 
Maybe Paul can adjust OSC0_CR now.

A T3.6 has run the "DHRYSTONE" benchmark program at 256MHz all night long. No problems!
 
I've put this on my list of stuff to look at for 1.45.

Please understand I am currently only working on very serious bugs at this time. All other dev cycles are going to the new hardware for early 2019.

The recently reported Serial1.end() bug that causes a complete serial lockup was an example of an issue I consider serious enough. Everything else is going on a list to do later, perhaps around mid-2019 for things like this which are minor issues.
 
I'm not sure there is a "best" value for OSC0_CR, as the crystal characteristics will vary from chip to chip and temperature will alter the crystal frequency. Testing one of my T3.6 with GPS PPS, the best OSC0_CR is the default 0x8a (10 pF), measuring -5 ppm. On a T3.2, the best is 0x8c (6 pF) at 2 ppm, and on a T3.5 the best is 0x8c (6 pF) at 0 ppm. On my early beta T3.6, the best is the default 0x8a at -4 ppm. On another T3.6, 0x8a is -5 ppm and 0x82 is 5 ppm. On another T3.5, 0x8c is best, measuring -1 ppm.
 
An anecdote by the way:
Soon it will be carnival time. In Germany, the number associated with carnival is 11.
Carnival begins, for example, on 11.11 at exactly 11:11.
What luck, because the Teensy 3.6 at 256MHz, with optimization "Fastest + pure-code with LTO", reaches pretty much exactly 1111111 Dhrystones.

@manitou: It's a WCO.
(Weather Controlled Oscillator)
:eek:
 
:eek: … Adjusting the clock will ruin that 1111111 value - or the wrong temp :)

Maybe Paul can adjust OSC0_CR now.
I was wondering about that, but wider testing would help and would need a trusted clock, and many GPS units don't present a PPS pin even if the user has one. Perhaps a standard .HEX file like Frank's Dhrystone value?

The manual has this:
28.8.1.1 OSC Control Register (OSC_CR)
NOTE
After OSC is enabled and starts generating the clocks, the configurations such as low power and frequency range, must not be changed.
This would affect SNOOZE and maybe EEPROM dropping HSRUN?

'Best' was relative to it being worse without the change, where the rate seemed the same as on the 1 or 2 other T_3.6's I had, which showed a similarly low default value. When I ran that for a day in the window with varying temperature, the range was IIRC ~500 cycles over some room-temperature range.
Only the low 4 bits affect capacitance; the 0x80 bit enables the external reference clock, and I just saw that 0x82 behaves the same as 0x02 on this T_3.6.
 
I was wondering about that - but wider testing would help and would need a trusted clock - and many GPS units don't present a PPS pin even if user has one. Perhaps a standard .HEX file like Frank's Dhrystone value?

Here it is :)
A HEX-File would run on a specific Teensy model only.

I don't remember where I found it.
The first version of "DHRYSTONE" is from 1984.

Nowadays, DHRYSTONE is not a good benchmark (Google...). But it's good enough to show that -O2 is not always the best choice. Indeed, this gives the best numbers with all optimizations turned on.
(And I noticed "Houston, we have a problem!!": optimizing for size has a serious problem (crash?). It must be something in the core (USB?), but I'm too lazy to look for the reason.)

Code:
/*
 ****************************************************************************

                     "DHRYSTONE" Benchmark Program
                     -----------------------------

    Version:    C, Version 2.1

    File:       dhry.h (part 1 of 3)

    Date:       May 25, 1988

    Author:     Reinhold P. Weicker
                        Siemens AG, E STE 35
                        Postfach 3240
                        8520 Erlangen
                        Germany (West)
                                Phone:  [xxx-49]-9131-7-20330
                                        (8-17 Central European Time)
                                Usenet: ..!mcvax!unido!estevax!weicker

                Original Version (in Ada) published in
                "Communications of the ACM" vol. 27., no. 10 (Oct. 1984),
                pp. 1013 - 1030, together with the statistics
                on which the distribution of statements etc. is based.

                In this C version, the following C library functions are used:
                - strcpy, strcmp (inside the measurement loop)
                - Serial.printf, scanf (outside the measurement loop)
                In addition, Berkeley UNIX system calls "times ()" or "time ()"
                are used for execution time measurement. For measurements
                on other systems, these calls have to be changed.

    Collection of Results:
                Reinhold Weicker (address see above) and

                Rick Richardson
                PC Research. Inc.
                94 Apple Orchard Drive
                Tinton Falls, NJ 07724
                        Phone:  (201) 389-8963 (9-17 EST)
                        Usenet: ...!uunet!pcrat!rick

        Please send results to Rick Richardson and/or Reinhold Weicker.
        Complete information should be given on hardware and software used.
        Hardware information includes: Machine type, CPU, type and size
        of caches; for microprocessors: clock frequency, memory speed
        (number of wait states).
        Software information includes: Compiler (and runtime library)
        manufacturer and version, compilation switches, OS version.
        The Operating System version may give an indication about the
        compiler; Dhrystone itself performs no OS calls in the measurement loop.

        The complete output generated by the program should be mailed
        such that at least some checks for correctness can be made.

 ***************************************************************************

    History:    This version C/2.1 has been made for two reasons:

                1) There is an obvious need for a common C version of
                Dhrystone, since C is at present the most popular system
                programming language for the class of processors
                (microcomputers, minicomputers) where Dhrystone is used most.
                There should be, as far as possible, only one C version of
                Dhrystone such that results can be compared without
                restrictions. In the past, the C versions distributed
                by Rick Richardson (Version 1.1) and by Reinhold Weicker
                had small (though not significant) differences.

                2) As far as it is possible without changes to the Dhrystone
                statistics, optimizing compilers should be prevented from
                removing significant statements.

                This C version has been developed in cooperation with
                Rick Richardson (Tinton Falls, NJ), it incorporates many
                ideas from the "Version 1.1" distributed previously by
                him over the UNIX network Usenet.
                I also thank Chaim Benedelac (National Semiconductor),
                David Ditzel (SUN), Earl Killian and John Mashey (MIPS),
                Alan Smith and Rafael Saavedra-Barrera (UC at Berkeley)
                for their help with comments on earlier versions of the
                benchmark.

    Changes:    In the initialization part, this version follows mostly
                Rick Richardson's version distributed via Usenet, not the
                version distributed earlier via floppy disk by Reinhold Weicker.
                As a concession to older compilers, names have been made
                unique within the first 8 characters.
                Inside the measurement loop, this version follows the
                version previously distributed by Reinhold Weicker.

                At several places in the benchmark, code has been added,
                but within the measurement loop only in branches that
                are not executed. The intention is that optimizing compilers
                should be prevented from moving code out of the measurement
                loop, or from removing code altogether. Since the statements
                that are executed within the measurement loop have NOT been
                changed, the numbers defining the "Dhrystone distribution"
                (distribution of statements, operand types and locality)
                still hold. Except for sophisticated optimizing compilers,
                execution times for this version should be the same as
                for previous versions.

                Since it has proven difficult to subtract the time for the
                measurement loop overhead in a correct way, the loop check
                has been made a part of the benchmark. This does have
                an impact - though a very minor one - on the distribution
                statistics which have been updated for this version.

                All changes within the measurement loop are described
                and discussed in the companion paper "Rationale for
                Dhrystone version 2".

                Because of the self-imposed limitation that the order and
                distribution of the executed statements should not be
                changed, there are still cases where optimizing compilers
                may not generate code for some statements. To a certain
                degree, this is unavoidable for small synthetic benchmarks.
                Users of the benchmark are advised to check code listings
                whether code is generated for all statements of Dhrystone.

                Version 2.1 is identical to version 2.0 distributed via
                the UNIX network Usenet in March 1988 except that it corrects
                some minor deficiencies that were found by users of version 2.0.
                The only change within the measurement loop is that a
                non-executed "else" part was added to the "if" statement in
                Func_3, and a non-executed "else" part removed from Proc_3.

 ***************************************************************************

   Defines:     The following "Defines" are possible:
                -DREG=register          (default: Not defined)
                        As an approximation to what an average C programmer
                        might do, the "register" storage class is applied
                        (if enabled by -DREG=register)
                        - for local variables, if they are used (dynamically)
                          five or more times
                        - for parameters if they are used (dynamically)
                          six or more times
                        Note that an optimal "register" strategy is
                        compiler-dependent, and that "register" declarations
                        do not necessarily lead to faster execution.
                -DNOSTRUCTASSIGN        (default: Not defined)
                        Define if the C compiler does not support
                        assignment of structures.
                -DNOENUMS               (default: Not defined)
                        Define if the C compiler does not support
                        enumeration types.
                -DTIMES                 (default)
                -DTIME
                        The "times" function of UNIX (returning process times)
                        or the "time" function (returning wallclock time)
                        is used for measurement.
                        For single user machines, "time ()" is adequate. For
                        multi-user machines where you cannot get single-user
                        access, use the "times ()" function. If you have
                        neither, use a stopwatch in the dead of night.
                        "Serial.printf"s are provided marking the points "Start Timer"
                        and "Stop Timer". DO NOT use the UNIX "time(1)"
                        command, as this will measure the total time to
                        run this program, which will (erroneously) include
                        the time to allocate storage (malloc) and to perform
                        the initialization.
                -DHZ=nnn
                        In Berkeley UNIX, the function "times" returns process
                        time in 1/HZ seconds, with HZ = 60 for most systems.
                        CHECK YOUR SYSTEM DESCRIPTION BEFORE YOU JUST APPLY
                        A VALUE.

 ***************************************************************************

    Compilation model and measurement (IMPORTANT):

    This C version of Dhrystone consists of three files:
    - dhry.h (this file, containing global definitions and comments)
    - dhry_1.c (containing the code corresponding to Ada package Pack_1)
    - dhry_2.c (containing the code corresponding to Ada package Pack_2)

    The following "ground rules" apply for measurements:
    - Separate compilation
    - No procedure merging
    - Otherwise, compiler optimizations are allowed but should be indicated
    - Default results are those without register declarations
    See the companion paper "Rationale for Dhrystone Version 2" for a more
    detailed discussion of these ground rules.

    For 16-Bit processors (e.g. 80186, 80286), times for all compilation
    models ("small", "medium", "large" etc.) should be given if possible,
    together with a definition of these models for the compiler system used.

 **************************************************************************

    Dhrystone (C version) statistics:

    [Comment from the first distribution, updated for version 2.
     Note that because of language differences, the numbers are slightly
     different from the Ada version.]

    The following program contains statements of a high level programming
    language (here: C) in a distribution considered representative:

      assignments                  52 (51.0 %)
      control statements           33 (32.4 %)
      procedure, function calls    17 (16.7 %)

    103 statements are dynamically executed. The program is balanced with
    respect to the three aspects:

      - statement type
      - operand type
      - operand locality
           operand global, local, parameter, or constant.

    The combination of these three aspects is balanced only approximately.

    1. Statement Type:
    -----------------             number

       V1 = V2                     9
         (incl. V1 = F(..)
       V = Constant               12
       Assignment,                 7
         with array element
       Assignment,                 6
         with record component
                                  --
                                  34       34

       X = Y +|-|"&&"|"|" Z        5
       X = Y +|-|"==" Constant     6
       X = X +|- 1                 3
       X = Y *|/ Z                 2
       X = Expression,             1
             two operators
       X = Expression,             1
             three operators
                                  --
                                  18       18

       if ....                    14
         with "else"      7
         without "else"   7
             executed        3
             not executed    4
       for ...                     7  |  counted every time
       while ...                   4  |  the loop condition
       do ... while                1  |  is evaluated
       switch ...                  1
       break                       1
       declaration with            1
         initialization
                                  --
                                  34       34

       P (...)  procedure call    11
         user procedure      10
         library procedure    1
       X = F (...)
               function  call      6
         user function        5
         library function     1
                                  --
                                  17       17
                                          ---
                                          103

      The average number of parameters in procedure or function calls
      is 1.82 (not counting the function values as implicit parameters).

    2. Operators
    ------------
                            number    approximate
                                      percentage

      Arithmetic             32          50.8

         +                     21          33.3
         -                      7          11.1
         *                      3           4.8
         / (int div)            1           1.6

      Comparison             27           42.8

         ==                     9           14.3
         /=                     4            6.3
         >                      1            1.6
         <                      3            4.8
         >=                     1            1.6
         <=                     9           14.3

      Logic                   4            6.3

         && (AND-THEN)          1            1.6
         |  (OR)                1            1.6
         !  (NOT)               2            3.2

                             --          -----
                             63          100.1


    3. Operand Type (counted once per operand reference):
    ---------------
                            number    approximate
                                      percentage

       Integer               175        72.3 %
       Character              45        18.6 %
       Pointer                12         5.0 %
       String30                6         2.5 %
       Array                   2         0.8 %
       Record                  2         0.8 %
                             ---       -------
                             242       100.0 %

    When there is an access path leading to the final operand (e.g. a record
    component), only the final data type on the access path is counted.


    4. Operand Locality:
    -------------------
                                  number    approximate
                                            percentage

       local variable              114        47.1 %
       global variable              22         9.1 %
       parameter                    45        18.6 %
          value                        23         9.5 %
          reference                    22         9.1 %
       function result               6         2.5 %
       constant                     55        22.7 %
                                   ---       -------
                                   242       100.0 %


    The program does not compute anything meaningful, but it is syntactically
    and semantically correct. All variables have a value assigned to them
    before they are used as a source operand.

    There has been no explicit effort to account for the effects of a
    cache, or to balance the use of long or short displacements for code or
    data.

 ***************************************************************************
*/

/* Compiler and system dependent definitions: */

#define TIME 1


int time() {
  return millis();
}


#ifndef TIME
#undef TIMES
#define TIMES
#endif
/* Use times(2) time function unless    */
/* explicitly defined otherwise         */

#ifdef MSC_CLOCK
#undef HZ
#undef TIMES
#include <time.h>
#define HZ    CLK_TCK
#endif
/* Use Microsoft C hi-res clock */

#ifdef TIMES
#include <sys/types.h>
#include <sys/times.h>
/* for "times" */
#endif

#define Mic_secs_Per_Second     1.0
/* Berkeley UNIX C returns process times in seconds/HZ */

#ifdef  NOSTRUCTASSIGN
#define structassign(d, s)      memcpy(&(d), &(s), sizeof(d))
#else
#define structassign(d, s)      d = s
#endif

#ifdef  NOENUM
#define Ident_1 0
#define Ident_2 1
#define Ident_3 2
#define Ident_4 3
#define Ident_5 4
typedef int   Enumeration;
#else
typedef       enum    {Ident_1, Ident_2, Ident_3, Ident_4, Ident_5}
Enumeration;
#endif
/* for boolean and enumeration types in Ada, Pascal */

/* General definitions: */

//#include <stdio.h>
/* for strcpy, strcmp */

#define Null 0
/* Value of a Null pointer */
#define true  1
#define false 0

typedef int     One_Thirty;
typedef int     One_Fifty;
typedef char    Capital_Letter;
typedef int     Boolean;
typedef char    Str_30 [31];
typedef int     Arr_1_Dim [50];
typedef int     Arr_2_Dim [50] [50];

typedef struct record
{
  struct record *Ptr_Comp;
  Enumeration    Discr;
  union {
    struct {
      Enumeration Enum_Comp;
      int         Int_Comp;
      char        Str_Comp [31];
    } var_1;
    struct {
      Enumeration E_Comp_2;
      char        Str_2_Comp [31];
    } var_2;
    struct {
      char        Ch_1_Comp;
      char        Ch_2_Comp;
    } var_3;
  } variant;
} Rec_Type, *Rec_Pointer;


/* Global Variables: */

Rec_Pointer     Ptr_Glob,
                Next_Ptr_Glob;
int             Int_Glob;
Boolean         Bool_Glob;
char            Ch_1_Glob,
                Ch_2_Glob;
int             Arr_1_Glob [50];
int             Arr_2_Glob [50] [50];

extern char     *malloc ();
Enumeration     Func_1 ();
/* forward declaration necessary since Enumeration may not simply be int */

#ifndef REG
Boolean Reg = false;
#define REG
/* REG becomes defined as empty */
/* i.e. no register variables   */
#else
Boolean Reg = true;
#endif

/* variables for time measurement: */

#ifdef TIMES
struct tms      time_info;
extern  int     times ();
/* see library function "times" */
#define Too_Small_Time (2*HZ)
/* Measurements should last at least about 2 seconds */
#endif
#ifdef TIME
//extern long     time();
/* see library function "time"  */
#define Too_Small_Time 2
/* Measurements should last at least 2 seconds */
#endif
#ifdef MSC_CLOCK
extern clock_t    clock();
#define Too_Small_Time (2*HZ)
#endif

long            Begin_Time,
                End_Time,
                User_Time;
float           Microseconds,
                Dhrystones_Per_Second;

/* end of variables for time measurement */


#include <malloc.h>
#include <string.h>
Enumeration Func_1 (Capital_Letter Ch_1_Par_Val, Capital_Letter  Ch_2_Par_Val);
Boolean Func_2 (Str_30 Str_1_Par_Ref, Str_30 Str_2_Par_Ref);
Boolean Func_3 (Enumeration Enum_Par_Val);
void Proc_1 (Rec_Pointer Ptr_Val_Par);
void Proc_2 (One_Fifty   *Int_Par_Ref);
void Proc_3 (Rec_Pointer *Ptr_Ref_Par);
void Proc_4 (void);
void Proc_5 (void);
void Proc_6 (Enumeration Enum_Val_Par, Enumeration * Enum_Ref_Par);
void Proc_7 (One_Fifty Int_1_Par_Val, One_Fifty Int_2_Par_Val, One_Fifty *Int_Par_Ref);
void Proc_8 (Arr_1_Dim Arr_1_Par_Ref, Arr_2_Dim Arr_2_Par_Ref, int Int_1_Par_Val, int Int_2_Par_Val);
void loop(void) {}
void setup(void)
//main ()
/*****/

/* main program, corresponds to procedures        */
/* Main and Proc_0 in the Ada version             */
{
  One_Fifty       Int_1_Loc;
  REG   One_Fifty       Int_2_Loc;
  One_Fifty       Int_3_Loc;
  REG   char            Ch_Index;
  Enumeration     Enum_Loc;
  Str_30          Str_1_Loc;
  Str_30          Str_2_Loc;
  REG   int             Run_Index;
  REG   int             Number_Of_Runs;

  /* Initializations */

  Next_Ptr_Glob = (Rec_Pointer) malloc (sizeof (Rec_Type));
  Ptr_Glob = (Rec_Pointer) malloc (sizeof (Rec_Type));

  Ptr_Glob->Ptr_Comp                    = Next_Ptr_Glob;
  Ptr_Glob->Discr                       = Ident_1;
  Ptr_Glob->variant.var_1.Enum_Comp     = Ident_3;
  Ptr_Glob->variant.var_1.Int_Comp      = 40;

  strcpy (Ptr_Glob->variant.var_1.Str_Comp,
          "DHRYSTONE PROGRAM, SOME STRING");
  strcpy (Str_1_Loc, "DHRYSTONE PROGRAM, 1'ST STRING");

  Arr_2_Glob [8][7] = 10;
  /* Was missing in published program. Without this statement,    */
  /* Arr_2_Glob [8][7] would have an undefined value.             */
  /* Warning: With 16-Bit processors and Number_Of_Runs > 32000,  */
  /* overflow may occur for this array element.                   */
  delay(1000);
  //  while(!Serial){};
  Serial.printf ("\n");
  Serial.printf ("Dhrystone Benchmark, Version 2.1 (Language: C)\n");
  Serial.printf ("\n");
  if (Reg)
  {
    Serial.printf ("Program compiled with 'register' attribute\n");
    Serial.printf ("\n");
  }
  else
  {
    Serial.printf ("Program compiled without 'register' attribute\n");
    Serial.printf ("\n");
  }
  Serial.printf ("Please give the number of runs through the benchmark: ");
  {
    int n = 150000;
    //scanf ("%d", &n);

    Number_Of_Runs = n;
  }
  Serial.printf ("\n");

  Serial.printf ("Execution starts, %d runs through Dhrystone\n", Number_Of_Runs);

  /***************/
  /* Start timer */
  /***************/

#ifdef TIMES
  times (&time_info);
  Begin_Time = (long) time_info.tms_utime;
#endif
#ifdef TIME
  //Begin_Time = time ( (long *) 0);
  Begin_Time = time ();
#endif
#ifdef MSC_CLOCK
  Begin_Time = clock();
#endif

  for (Run_Index = 1; Run_Index <= Number_Of_Runs; ++Run_Index)
  {

    Proc_5();
    Proc_4();
    /* Ch_1_Glob == 'A', Ch_2_Glob == 'B', Bool_Glob == true */
    Int_1_Loc = 2;
    Int_2_Loc = 3;
    strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 2'ND STRING");
    Enum_Loc = Ident_2;
    Bool_Glob = ! Func_2 (Str_1_Loc, Str_2_Loc);
    /* Bool_Glob == 1 */
    while (Int_1_Loc < Int_2_Loc)  /* loop body executed once */
    {
      Int_3_Loc = 5 * Int_1_Loc - Int_2_Loc;
      /* Int_3_Loc == 7 */
      Proc_7 (Int_1_Loc, Int_2_Loc, &Int_3_Loc);
      /* Int_3_Loc == 7 */
      Int_1_Loc += 1;
    } /* while */
    /* Int_1_Loc == 3, Int_2_Loc == 3, Int_3_Loc == 7 */
    Proc_8 (Arr_1_Glob, Arr_2_Glob, Int_1_Loc, Int_3_Loc);
    /* Int_Glob == 5 */
    Proc_1 (Ptr_Glob);
    for (Ch_Index = 'A'; Ch_Index <= Ch_2_Glob; ++Ch_Index)
      /* loop body executed twice */
    {
      if (Enum_Loc == Func_1 (Ch_Index, 'C'))
        /* then, not executed */
      {
        Proc_6 (Ident_1, &Enum_Loc);
        strcpy (Str_2_Loc, "DHRYSTONE PROGRAM, 3'RD STRING");
        Int_2_Loc = Run_Index;
        Int_Glob = Run_Index;
      }
    }
    /* Int_1_Loc == 3, Int_2_Loc == 3, Int_3_Loc == 7 */
    Int_2_Loc = Int_2_Loc * Int_1_Loc;
    Int_1_Loc = Int_2_Loc / Int_3_Loc;
    Int_2_Loc = 7 * (Int_2_Loc - Int_3_Loc) - Int_1_Loc;
    /* Int_1_Loc == 1, Int_2_Loc == 13, Int_3_Loc == 7 */
    Proc_2 (&Int_1_Loc);
    /* Int_1_Loc == 5 */

  } /* loop "for Run_Index" */

  /**************/
  /* Stop timer */
  /**************/

#ifdef TIMES
  times (&time_info);
  End_Time = (long) time_info.tms_utime;
#endif
#ifdef TIME
  End_Time = time ();
#endif
#ifdef MSC_CLOCK
  End_Time = clock();
#endif

  Serial.printf ("Execution ends\n");
  Serial.printf ("\n");
  Serial.printf ("Final values of the variables used in the benchmark:\n");
  Serial.printf ("\n");
  Serial.printf ("Int_Glob:            %d\n", Int_Glob);
  Serial.printf ("        should be:   %d\n", 5);
  Serial.printf ("Bool_Glob:           %d\n", Bool_Glob);
  Serial.printf ("        should be:   %d\n", 1);
  Serial.printf ("Ch_1_Glob:           %c\n", Ch_1_Glob);
  Serial.printf ("        should be:   %c\n", 'A');
  Serial.printf ("Ch_2_Glob:           %c\n", Ch_2_Glob);
  Serial.printf ("        should be:   %c\n", 'B');
  Serial.printf ("Arr_1_Glob[8]:       %d\n", Arr_1_Glob[8]);
  Serial.printf ("        should be:   %d\n", 7);
  Serial.printf ("Arr_2_Glob[8][7]:    %d\n", Arr_2_Glob[8][7]);
  Serial.printf ("        should be:   Number_Of_Runs + 10\n");
  Serial.printf ("Ptr_Glob->\n");
  Serial.printf ("  Ptr_Comp:          %d\n", (int) Ptr_Glob->Ptr_Comp);
  Serial.printf ("        should be:   (implementation-dependent)\n");
  Serial.printf ("  Discr:             %d\n", Ptr_Glob->Discr);
  Serial.printf ("        should be:   %d\n", 0);
  Serial.printf ("  Enum_Comp:         %d\n", Ptr_Glob->variant.var_1.Enum_Comp);
  Serial.printf ("        should be:   %d\n", 2);
  Serial.printf ("  Int_Comp:          %d\n", Ptr_Glob->variant.var_1.Int_Comp);
  Serial.printf ("        should be:   %d\n", 17);
  Serial.printf ("  Str_Comp:          %s\n", Ptr_Glob->variant.var_1.Str_Comp);
  Serial.printf ("        should be:   DHRYSTONE PROGRAM, SOME STRING\n");
  Serial.printf ("Next_Ptr_Glob->\n");
  Serial.printf ("  Ptr_Comp:          %d\n", (int) Next_Ptr_Glob->Ptr_Comp);
  Serial.printf ("        should be:   (implementation-dependent), same as above\n");
  Serial.printf ("  Discr:             %d\n", Next_Ptr_Glob->Discr);
  Serial.printf ("        should be:   %d\n", 0);
  Serial.printf ("  Enum_Comp:         %d\n", Next_Ptr_Glob->variant.var_1.Enum_Comp);
  Serial.printf ("        should be:   %d\n", 1);
  Serial.printf ("  Int_Comp:          %d\n", Next_Ptr_Glob->variant.var_1.Int_Comp);
  Serial.printf ("        should be:   %d\n", 18);
  Serial.printf ("  Str_Comp:          %s\n",
                 Next_Ptr_Glob->variant.var_1.Str_Comp);
  Serial.printf ("        should be:   DHRYSTONE PROGRAM, SOME STRING\n");
  Serial.printf ("Int_1_Loc:           %d\n", Int_1_Loc);
  Serial.printf ("        should be:   %d\n", 5);
  Serial.printf ("Int_2_Loc:           %d\n", Int_2_Loc);
  Serial.printf ("        should be:   %d\n", 13);
  Serial.printf ("Int_3_Loc:           %d\n", Int_3_Loc);
  Serial.printf ("        should be:   %d\n", 7);
  Serial.printf ("Enum_Loc:            %d\n", Enum_Loc);
  Serial.printf ("        should be:   %d\n", 1);
  Serial.printf ("Str_1_Loc:           %s\n", Str_1_Loc);
  Serial.printf ("        should be:   DHRYSTONE PROGRAM, 1'ST STRING\n");
  Serial.printf ("Str_2_Loc:           %s\n", Str_2_Loc);
  Serial.printf ("        should be:   DHRYSTONE PROGRAM, 2'ND STRING\n");
  Serial.printf ("\n");

  User_Time = End_Time - Begin_Time;

  if (User_Time < Too_Small_Time)
  {
    Serial.printf ("Measured time too small to obtain meaningful results\n");
    Serial.printf ("Please increase number of runs\n");
    Serial.printf ("\n");
  }
  else
  {
#ifdef TIME
    Microseconds = (float) User_Time
                   / (float) Number_Of_Runs;
    Dhrystones_Per_Second = (float) Number_Of_Runs * 1000 / (float) (User_Time);
#else
    Microseconds = (float) User_Time * Mic_secs_Per_Second
                   / ((float) HZ * ((float) Number_Of_Runs));
    Dhrystones_Per_Second = ((float) HZ * (float) Number_Of_Runs)
                            / (float) User_Time;
#endif

    Serial.printf ("Microseconds for one run through Dhrystone: ");
    //Serial.printf ("%6.12f \n", Microseconds);
    Serial.println(Microseconds,12);
    Serial.printf ("Dhrystones per Second:                      ");
    //Serial.printf ("%6.12f \n", Dhrystones_Per_Second);
    Serial.println(Dhrystones_Per_Second,12);
    Serial.printf ("\n");

  }

}

void Proc_1 (Rec_Pointer Ptr_Val_Par)
//Proc_1 (Ptr_Val_Par)
/******************/

//REG Rec_Pointer Ptr_Val_Par;
/* executed once */
{
  REG Rec_Pointer Next_Record = Ptr_Val_Par->Ptr_Comp;
  /* == Ptr_Glob_Next */
  /* Local variable, initialized with Ptr_Val_Par->Ptr_Comp,    */
  /* corresponds to "rename" in Ada, "with" in Pascal           */

  structassign (*Ptr_Val_Par->Ptr_Comp, *Ptr_Glob);
  Ptr_Val_Par->variant.var_1.Int_Comp = 5;
  Next_Record->variant.var_1.Int_Comp
    = Ptr_Val_Par->variant.var_1.Int_Comp;
  Next_Record->Ptr_Comp = Ptr_Val_Par->Ptr_Comp;

  Proc_3 (&Next_Record->Ptr_Comp);
  /* Ptr_Val_Par->Ptr_Comp->Ptr_Comp
                      == Ptr_Glob->Ptr_Comp */

  if (Next_Record->Discr == Ident_1)
    /* then, executed */
  {
    Next_Record->variant.var_1.Int_Comp = 6;
    Proc_6 (Ptr_Val_Par->variant.var_1.Enum_Comp,
            &Next_Record->variant.var_1.Enum_Comp);
    Next_Record->Ptr_Comp = Ptr_Glob->Ptr_Comp;
    Proc_7 (Next_Record->variant.var_1.Int_Comp, 10,
            &Next_Record->variant.var_1.Int_Comp);
  }
  else /* not executed */
    structassign (*Ptr_Val_Par, *Ptr_Val_Par->Ptr_Comp);
} /* Proc_1 */

void Proc_2 (One_Fifty   *Int_Par_Ref)
//Proc_2 (Int_Par_Ref)
/******************/
/* executed once */
/* *Int_Par_Ref == 1, becomes 4 */

//One_Fifty   *Int_Par_Ref;
{
  One_Fifty  Int_Loc;
  Enumeration   Enum_Loc;

  Int_Loc = *Int_Par_Ref + 10;
  do /* executed once */
    if (Ch_1_Glob == 'A')
      /* then, executed */
    {
      Int_Loc -= 1;
      *Int_Par_Ref = Int_Loc - Int_Glob;
      Enum_Loc = Ident_1;
    } /* if */
  while (Enum_Loc != Ident_1); /* true */
} /* Proc_2 */

void Proc_3 (Rec_Pointer *Ptr_Ref_Par)
//Proc_3 (Ptr_Ref_Par)
/******************/
/* executed once */
/* Ptr_Ref_Par becomes Ptr_Glob */

//Rec_Pointer *Ptr_Ref_Par;

{
  if (Ptr_Glob != Null)
    /* then, executed */
    *Ptr_Ref_Par = Ptr_Glob->Ptr_Comp;
  Proc_7 (10, Int_Glob, &Ptr_Glob->variant.var_1.Int_Comp);
} /* Proc_3 */

void Proc_4 (void)
//Proc_4 () /* without parameters */
/*******/
/* executed once */
{
  Boolean Bool_Loc;

  Bool_Loc = Ch_1_Glob == 'A';
  Bool_Glob = Bool_Loc | Bool_Glob;
  Ch_2_Glob = 'B';
} /* Proc_4 */

void Proc_5 (void)
//Proc_5 () /* without parameters */
/*******/
/* executed once */
{
  Ch_1_Glob = 'A';
  Bool_Glob = false;
} /* Proc_5 */


/* Procedure for the assignment of structures,          */
/* if the C compiler doesn't support this feature       */
#ifdef  NOSTRUCTASSIGN
memcpy (d, s, l)
register char   *d;
register char   *s;
register int    l;
{
  while (l--) *d++ = *s++;
}
#endif

#ifndef REG
#define REG
/* REG becomes defined as empty */
/* i.e. no register variables   */
#endif

extern  int     Int_Glob;
extern  char    Ch_1_Glob;

void Proc_6 (Enumeration Enum_Val_Par, Enumeration * Enum_Ref_Par)
//Proc_6 (Enum_Val_Par, Enum_Ref_Par)
/*********************************/
/* executed once */
/* Enum_Val_Par == Ident_3, Enum_Ref_Par becomes Ident_2 */

//Enumeration  Enum_Val_Par;
//Enumeration *Enum_Ref_Par;
{
  *Enum_Ref_Par = Enum_Val_Par;
  if (! Func_3 (Enum_Val_Par))
    /* then, not executed */
    *Enum_Ref_Par = Ident_4;
  switch (Enum_Val_Par)
  {
    case Ident_1:
      *Enum_Ref_Par = Ident_1;
      break;
    case Ident_2:
      if (Int_Glob > 100)
        /* then */
        *Enum_Ref_Par = Ident_1;
      else *Enum_Ref_Par = Ident_4;
      break;
    case Ident_3: /* executed */
      *Enum_Ref_Par = Ident_2;
      break;
    case Ident_4: break;
    case Ident_5:
      *Enum_Ref_Par = Ident_3;
      break;
  } /* switch */
} /* Proc_6 */

void Proc_7 (One_Fifty Int_1_Par_Val, One_Fifty Int_2_Par_Val, One_Fifty *Int_Par_Ref)
//Proc_7 (Int_1_Par_Val, Int_2_Par_Val, Int_Par_Ref)
/**********************************************/
/* executed three times                                      */
/* first call:      Int_1_Par_Val == 2, Int_2_Par_Val == 3,  */
/*                  Int_Par_Ref becomes 7                    */
/* second call:     Int_1_Par_Val == 10, Int_2_Par_Val == 5, */
/*                  Int_Par_Ref becomes 17                   */
/* third call:      Int_1_Par_Val == 6, Int_2_Par_Val == 10, */
/*                  Int_Par_Ref becomes 18                   */
//One_Fifty       Int_1_Par_Val;
//One_Fifty       Int_2_Par_Val;
//One_Fifty      *Int_Par_Ref;
{
  One_Fifty Int_Loc;

  Int_Loc = Int_1_Par_Val + 2;
  *Int_Par_Ref = Int_2_Par_Val + Int_Loc;
} /* Proc_7 */

void Proc_8 (Arr_1_Dim Arr_1_Par_Ref, Arr_2_Dim Arr_2_Par_Ref, int Int_1_Par_Val, int Int_2_Par_Val)
//Proc_8 (Arr_1_Par_Ref, Arr_2_Par_Ref, Int_1_Par_Val, Int_2_Par_Val)
/*********************************************************************/
/* executed once      */
/* Int_Par_Val_1 == 3 */
/* Int_Par_Val_2 == 7 */
//Arr_1_Dim       Arr_1_Par_Ref;
//Arr_2_Dim       Arr_2_Par_Ref;
//int             Int_1_Par_Val;
//int             Int_2_Par_Val;
{
  REG One_Fifty Int_Index;
  REG One_Fifty Int_Loc;

  Int_Loc = Int_1_Par_Val + 5;
  Arr_1_Par_Ref [Int_Loc] = Int_2_Par_Val;
  Arr_1_Par_Ref [Int_Loc + 1] = Arr_1_Par_Ref [Int_Loc];
  Arr_1_Par_Ref [Int_Loc + 30] = Int_Loc;
  for (Int_Index = Int_Loc; Int_Index <= Int_Loc + 1; ++Int_Index)
    Arr_2_Par_Ref [Int_Loc] [Int_Index] = Int_Loc;
  Arr_2_Par_Ref [Int_Loc] [Int_Loc - 1] += 1;
  Arr_2_Par_Ref [Int_Loc + 20] [Int_Loc] = Arr_1_Par_Ref [Int_Loc];
  Int_Glob = 5;
} /* Proc_8 */

Enumeration Func_1 (Capital_Letter Ch_1_Par_Val, Capital_Letter  Ch_2_Par_Val)
//Enumeration Func_1 (Ch_1_Par_Val, Ch_2_Par_Val)
/*************************************************/
/* executed three times                                         */
/* first call:      Ch_1_Par_Val == 'H', Ch_2_Par_Val == 'R'    */
/* second call:     Ch_1_Par_Val == 'A', Ch_2_Par_Val == 'C'    */
/* third call:      Ch_1_Par_Val == 'B', Ch_2_Par_Val == 'C'    */

//Capital_Letter   Ch_1_Par_Val;
//Capital_Letter   Ch_2_Par_Val;
{
  Capital_Letter        Ch_1_Loc;
  Capital_Letter        Ch_2_Loc;

  Ch_1_Loc = Ch_1_Par_Val;
  Ch_2_Loc = Ch_1_Loc;
  if (Ch_2_Loc != Ch_2_Par_Val)
    /* then, executed */
    return (Ident_1);
  else  /* not executed */
  {
    Ch_1_Glob = Ch_1_Loc;
    return (Ident_2);
  }
} /* Func_1 */


Boolean Func_2 (Str_30 Str_1_Par_Ref, Str_30 Str_2_Par_Ref)
//Boolean Func_2 (Str_1_Par_Ref, Str_2_Par_Ref)
/*************************************************/
/* executed once */
/* Str_1_Par_Ref == "DHRYSTONE PROGRAM, 1'ST STRING" */
/* Str_2_Par_Ref == "DHRYSTONE PROGRAM, 2'ND STRING" */

//Str_30  Str_1_Par_Ref;
//Str_30  Str_2_Par_Ref;
{
  REG One_Thirty        Int_Loc;
  Capital_Letter    Ch_Loc;

  Int_Loc = 2;
  while (Int_Loc <= 2) /* loop body executed once */
    if (Func_1 (Str_1_Par_Ref[Int_Loc],
                Str_2_Par_Ref[Int_Loc + 1]) == Ident_1)
      /* then, executed */
    {
      Ch_Loc = 'A';
      Int_Loc += 1;
    } /* if, while */
  if (Ch_Loc >= 'W' && Ch_Loc < 'Z')
    /* then, not executed */
    Int_Loc = 7;
  if (Ch_Loc == 'R')
    /* then, not executed */
    return (true);
  else /* executed */
  {



    if (strcmp (Str_1_Par_Ref, Str_2_Par_Ref) > 0)
      /* then, not executed */
    {
      Int_Loc += 7;
      Int_Glob = Int_Loc;
      return (true);
    }
    else /* executed */
      return (false);



  } /* if Ch_Loc */
} /* Func_2 */

Boolean Func_3 (Enumeration Enum_Par_Val)
//Boolean Func_3 (Enum_Par_Val)
/***************************/
/* executed once        */
/* Enum_Par_Val == Ident_3 */
//Enumeration Enum_Par_Val;
{
  Enumeration Enum_Loc;

  Enum_Loc = Enum_Par_Val;
  if (Enum_Loc == Ident_3)
    /* then, executed */
    return (true);
  else /* not executed */
    return (false);
} /* Func_3 */

**Some people have tried in the past to make me believe that, without success ;)
***Don't use its results for comparison with other computers. It is not separated into different compilation units (see the info in the comments).

https://en.wikipedia.org/wiki/Dhrystone
 
With the GNU Arm Embedded Toolchain (version 7-2018-q2-update, released June 27, 2018) it is, of course, faster: 1136363 Dhrystones per second.
 
(And I noticed that optimizing for size has a serious problem ("Houston, we have a problem!!"), possibly a crash. It must be something in the core (USB?), but I'm too lazy to look for the reason.)

printf of float/double doesn't work with the "smallest code" optimization. Serial.println(Dhrystones_Per_Second); will reveal the result.
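If a formatted string is needed rather than a bare Serial.println, one possible workaround is to build the decimal text from integer arithmetic only, so no float support in printf is required. This is just a sketch: formatFixed and appendUnsigned are hypothetical helpers, not core functions, and the scaled value must fit in 32 bits, so it cannot reproduce the 12 decimals used in the sketch above.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: append v in decimal, zero-padded on the left to minWidth.
static void appendUnsigned(std::string &out, uint32_t v, int minWidth) {
    char digits[10];               // a uint32_t has at most 10 decimal digits
    int n = 0;
    do {
        digits[n++] = static_cast<char>('0' + v % 10);
        v /= 10;
    } while (v != 0);
    for (int i = n; i < minWidth; ++i) out += '0';
    while (n > 0) out += digits[--n];
}

// Hypothetical helper: fixed-point text for value with the given number of
// decimals. Assumes |value| * 10^decimals stays below 2^32.
std::string formatFixed(double value, int decimals) {
    std::string out;
    if (value < 0) { out += '-'; value = -value; }
    uint32_t scale = 1;
    for (int i = 0; i < decimals; ++i) scale *= 10;
    // Round to nearest at the last requested decimal place.
    uint32_t scaled = static_cast<uint32_t>(value * scale + 0.5);
    appendUnsigned(out, scaled / scale, 1);
    if (decimals > 0) {
        out += '.';
        appendUnsigned(out, scaled % scale, decimals);
    }
    return out;
}
```

On a Teensy the result could then be passed to Serial.println(formatFixed(Dhrystones_Per_Second, 2).c_str());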
 
Here it is :)
A HEX-File would run on a specific Teensy model only.

That was intended: anyone running it would use the same binary to show results, so the only things that change would be the clock/crystal and the cycles/sec. A couple of builds with fixed OSC0_CR settings would show whether that makes a difference, or whether everything points to a common Dhrystone number without a GPS/PPS reference. The source is welcome though, maybe I'll do that.

<edit>: Of course, a quick test points out that the "/sec" part of the number is all relative to the MCU clock :(
 
That's a mostly hardware-independent program which does not know about different time sources (hey, 1984!).
But it can be edited... an external clock would be best.
But I suspect the numbers are too small to reflect changes. Perhaps increase the 150000 runs by a factor of 10 or 100? 1000?

edit: I think another, simpler and more specialized program would be better for your intended use. Way off-topic here...
 
FWIW, here are CoreMark-ish numbers for the T3.6 with "Faster" optimization. The 240 MHz and 256 MHz runs were done today; the other data points are from 2016.
[Attached images: cm.png, cma.png]

from earlier coremark post
 