Teensy 2.0++: Best way to measure code execution time in C?

Status
Not open for further replies.
What's the best way to measure how long it takes for the Teensy 2.0++ to execute a portion of C code? There seem to be quite a few useful timer libraries available for C++, but I haven't been able to find as many methods in C. Here are the 4 possibilities I've seen so far:

  1. PaulStoffregen mentions in this post (https://forum.pjrc.com/threads/61561-Teensy-4-Global-vs-local-variables-speed-of-execution?highlight=execution) that you can use the ARM_DWT_CYCCNT cycle counter on a 32-bit ARM processor to track clock cycles. Is there an equivalent for the 8-bit Atmel AT90USB1286?
  2. Most C++ code takes advantage of the elapsedMillis/elapsedMicros types and the millis/micros functions mentioned here (https://www.pjrc.com/teensy/td_timing.html). Is there a C library equivalent? (<time.h> for AVR processors is missing the standard clock_t type that is usually used for this purpose (https://www.nongnu.org/avr-libc/user-manual/group__avr__time.html).)
  3. The usb_serial example includes a tx_benchmark file which sets timer0 to overflow every 4 ms. The code:
    Code:
    // configure timer0 to overflow every 4 ms, prescale=256, top=250
    // 250 * 256 / 16 MHz = 4 ms
    TIMSK0 = 0;
    TCCR0A = (1<<WGM01)|(1<<WGM00);
    OCR0A = 250;
    TCCR0B = (1<<WGM02)|(1<<CS02);

    ...

    // wait for a 4 ms timer0 period to begin
    CLEAR_TIMER0_OVERFLOW();
    while (!IS_TIMER0_OVERFLOW()) /* wait */ ;
    CLEAR_TIMER0_OVERFLOW();
    count=0;
    This seems to be the most likely solution, but I don't follow everything that is going on in this code. Is there a reference resource that would describe the system elements being manipulated here in more detail?
  4. The most accurate method seems to be using an oscilloscope to measure an active pin on the board, but I don't have access to an oscilloscope and don't need that level of accuracy.
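
For option 3, the arithmetic in the quoted comment (250 * 256 / 16 MHz = 4 ms) can be checked with a small helper. This is only a sketch: timer_period_us is a hypothetical name that is not part of the usb_serial example, and the 16 MHz clock is taken from the comment in that code. Whether one wrap takes OCR0A or OCR0A + 1 counts depends on the timer mode, so the count is left as a parameter; the Timer/Counter0 chapter of the AT90USB1286 datasheet describes TCCR0A, TCCR0B, OCR0A and TIMSK0 in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Period of one timer wrap, in microseconds.
 * f_cpu_hz: CPU clock in Hz (16 MHz assumed, per the quoted comment)
 * prescale: clock divider selected by the CS bits (256 in the quoted code)
 * counts:   timer counts per wrap (250 in the quoted comment)
 * Hypothetical helper for illustration only.
 */
uint32_t timer_period_us(uint32_t f_cpu_hz, uint32_t prescale, uint32_t counts)
{
    assert(f_cpu_hz != 0);
    /* 64-bit intermediate: 250 * 256 * 1000000 does not fit in 32 bits */
    return (uint32_t)(((uint64_t)counts * prescale * 1000000u) / f_cpu_hz);
}
```

With the values from the comment, timer_period_us(16000000, 256, 250) gives 4000 microseconds, which matches the stated 4 ms.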

Any tips as to which of the above four would be the best method to pursue, or if there is a better method that I am missing? Thanks for any input!

As an aside, one thing that I'm looking to test is whether my code handles the char type just as fast as the uint_fast8_t type as an input into a switch control flow. Theoretically they should be the same speed, but wanted to verify the implementation.
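
A minimal sketch of what that comparison could look like, assuming a made-up three-command switch: two functions with identical bodies that differ only in the parameter type, so each can be bracketed by whichever timing method is chosen. The function names and the 'a'/'b'/'c' cases are hypothetical, not taken from any real project. As far as I know, avr-libc defines uint_fast8_t as an 8-bit unsigned type, so the two variants would differ at most in signedness.

```c
#include <assert.h>
#include <stdint.h>

/* Variant 1: switch on a plain char. */
int handle_char(char cmd)
{
    switch (cmd) {
    case 'a': return 1;
    case 'b': return 2;
    case 'c': return 3;
    default:  return 0;
    }
}

/* Variant 2: the same switch on a uint_fast8_t. */
int handle_fast8(uint_fast8_t cmd)
{
    switch (cmd) {
    case 'a': return 1;
    case 'b': return 2;
    case 'c': return 3;
    default:  return 0;
    }
}

/* Sanity check before timing: both variants must agree on 7-bit input. */
void check_variants_agree(void)
{
    for (int c = 0; c < 128; c++)
        assert(handle_char((char)c) == handle_fast8((uint_fast8_t)c));
}
```

Timing each variant over many calls, or comparing the generated assembly for the two functions, would show whether the compiler treats them differently.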
 
https://en.reddit.com/r/arduino/comments/1q1chr/using_timer1_to_count_clock_cycles/

Timer0, which is used for millis() and micros(), interrupts every millisecond and may cause jitter in your Timer1 measurement.
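
One way to live with that jitter, offered only as a sketch: repeat the measurement several times and look at the smallest and largest tick counts. The minimum is the run least likely to have been hit by an interrupt. tick_range and summarize_ticks are hypothetical names, and filling the samples array from TCNT1 is left to the measuring sketch. Briefly disabling interrupts around the timed code with cli() and sei() from <avr/interrupt.h> is another option.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest and largest tick count seen over repeated measurements. */
typedef struct {
    uint16_t min_ticks;
    uint16_t max_ticks;
} tick_range;

/* Scan n samples (n must be at least 1) and return their range. */
tick_range summarize_ticks(const uint16_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    tick_range r = { samples[0], samples[0] };
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < r.min_ticks) r.min_ticks = samples[i];
        if (samples[i] > r.max_ticks) r.max_ticks = samples[i];
    }
    return r;
}
```

A large gap between min_ticks and max_ticks suggests that an interrupt landed inside at least one of the runs.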

You can also look at the assembler code the compiler generates (.lst).

A simple Timer1 cycle-counting sketch:
Code:
// cycle counter using timer 1

volatile unsigned int t1, t2;

void setup()
{
  unsigned long result;
  unsigned int i, mint, maxt;


  Serial.begin(9600); while (!Serial);

  TCCR1A = 0;// set registers to 0
  TCCR1B = 0;
  TCCR1C = 0;

  TCNT1 = 0;
  TCCR1B = 1;// start timer 1
  TCCR1B = 0;    //stop the timer
  t2 = TCNT1;  //store passed ticks
  Serial.print("res ");
  Serial.println(t2);

  TCNT1 = 0;
  TCCR1B = 1;// start timer 1
  asm volatile("nop");
  asm volatile("nop");
  asm volatile("nop");
  asm volatile("nop");
  TCCR1B = 0;    //stop the timer
  t2 = TCNT1;  //store passed ticks
  Serial.print("4 nops  ");
  Serial.println(t2);


  TCNT1 = 0;
  TCCR1B = 1;// start timer 1
  result = millis();
  TCCR1B = 0;    //stop the timer
  t2 = TCNT1;  //store passed ticks
  Serial.print("millis ");
  Serial.println(t2);

  analogRead(A0);
  analogRead(A0);
  TCNT1 = 0;
  TCCR1B = 1;// start timer 1
  i = analogRead(A0);
  TCCR1B = 0;    //stop the timer
  t2 = TCNT1;  //store passed ticks
  Serial.print("ADC ");
  Serial.println(t2);
}

void loop() {}
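
To turn the printed tick counts into time, something like the following could be used. It is a sketch under two assumptions: the timer runs with prescaler 1 (TCCR1B = 1, as in the sketch above), so one tick is one CPU cycle, and the first "res" value is treated as the fixed overhead of starting and stopping the timer. net_ticks and ticks_to_ns are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks attributable to the code under test.
 * raw:      TCNT1 value read after the timed code
 * baseline: TCNT1 value from the empty start/stop measurement ("res")
 * Assumes raw >= baseline and that the counter did not wrap.
 */
uint16_t net_ticks(uint16_t raw, uint16_t baseline)
{
    return (uint16_t)(raw - baseline);
}

/* Convert ticks to nanoseconds, assuming prescaler 1 (1 tick = 1 cycle). */
uint32_t ticks_to_ns(uint16_t ticks, uint32_t f_cpu_hz)
{
    assert(f_cpu_hz != 0);
    /* 64-bit intermediate keeps ticks * 1e9 from overflowing */
    return (uint32_t)(((uint64_t)ticks * 1000000000u) / f_cpu_hz);
}
```

At 16 MHz one cycle is 62.5 ns, so four cycles come out as 250 ns. With prescaler 1 the 16-bit counter wraps after 65536 cycles, roughly 4.1 ms at 16 MHz, so longer code sections need a larger prescaler.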

Code:
  TCNT1 = 0;
 25e:   10 92 85 00     sts 0x0085, r1  ; 0x800085 <__TEXT_REGION_LENGTH__+0x7e0085>
 262:   10 92 84 00     sts 0x0084, r1  ; 0x800084 <__TEXT_REGION_LENGTH__+0x7e0084>
  TCCR1B = 1;// start timer 1
 266:   c0 93 81 00     sts 0x0081, r28 ; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>
    ...
  asm volatile("nop");
  asm volatile("nop");
  asm volatile("nop");
  asm volatile("nop");
  TCCR1B = 0;    //stop the timer
 272:   10 92 81 00     sts 0x0081, r1  ; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>
  t2 = TCNT1;  //store passed ticks
 276:   80 91 84 00     lds r24, 0x0084 ; 0x800084 <__TEXT_REGION_LENGTH__+0x7e0084>
 27a:   90 91 85 00     lds r25, 0x0085 ; 0x800085 <__TEXT_REGION_LENGTH__+0x7e0085>
 
If something like millis() is not sufficient for your needs, I mostly use option 4: hardware (a logic analyzer). I put something like digitalWriteFast(1, HIGH) at the start of the code and digitalWriteFast(1, LOW) at the other end, then look at the capture to see how long it takes.
Obviously you can use as many pins as you want. I showed a hard-coded 1 because with T3.x this reduces down to one instruction... I believe it is the same on your board.

In the days before I had a logic analyzer, I might have cobbled something together using another microprocessor (preferably a Teensy) running a simple sketch like:

Code:
#define TEST_PIN 1
void setup() {
    Serial.begin(115200);
    while (!Serial && millis() < 5000) ; // wait up to 5 s for USB serial; only on boards like Teensy or Leonardo where the main processor does the USB
    pinMode(TEST_PIN, INPUT_PULLDOWN);
}

void loop()  {
    while (!digitalReadFast(TEST_PIN)) ; // wait for pin to go high
    uint32_t start_time = micros();   // could be millis() if a longer time is needed
    while (digitalReadFast(TEST_PIN)) ; // wait for pin to go low again
    uint32_t delta_time = micros() - start_time; // must use the same clock as start_time
    Serial.println(delta_time, DEC);
}
Then hook TEST_PIN up to the pin your main sketch uses to bracket your code, and connect a common ground wire between the two boards.

If your simple sketch needs to be C without things like millis(), you could change the loop above to something like:
Code:
void loop()  {
    while (!digitalReadFast(TEST_PIN)) ; // wait for pin to go high
    uint32_t loop_count = 0;
    while (digitalReadFast(TEST_PIN)) loop_count++; // count iterations while pin is high
    Serial.println(loop_count, DEC);
}
This does nothing to convert the loop count into actual time. You can do that if it matters, but if you just want to know whether a change speeds things up, it is probably not needed.
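
If the conversion does matter, one possible approach (an assumption on my part, not something described above) is to calibrate: feed the counting board one pulse of known length, note the loop count it reports, and scale later counts by that ratio. loop_count_to_us is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a loop count to microseconds using one calibration measurement.
 * count:     loop count measured for the code under test
 * cal_count: loop count measured for a pulse of known length
 * cal_us:    length of that calibration pulse in microseconds
 */
uint32_t loop_count_to_us(uint32_t count, uint32_t cal_count, uint32_t cal_us)
{
    assert(cal_count != 0);
    /* 64-bit intermediate avoids overflow in count * cal_us */
    return (uint32_t)(((uint64_t)count * cal_us) / cal_count);
}
```

For example, if a 2000 microsecond calibration pulse yields a count of 1000, a later count of 500 corresponds to about 1000 microseconds. The result is only as good as the calibration pulse and assumes every loop iteration takes the same time.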
 