Teensyduino access to counting cpu cycles

Status
Not open for further replies.

BobQ

Member
Is there a way I can count CPU cycles from Teensyduino? I would like to measure the time interval between a rising edge on one pin and a rising edge on a different pin. Using attachInterrupt(pin, ISR, RISING) and micros() works well with my Teensy 3.1, with reliable resolution to 1 usec, but I'd like to do better. My application measures an interval from 0 to 100 msec once a second. Or maybe there's another way. I looked at FreqMeasure, which I think counts CPU cycles, but I don't know C and I can't tell what's going on inside. Thanks.
 
First you need to start the cycle counter. Just put this in setup
Code:
  ARM_DEMCR |= ARM_DEMCR_TRCENA;
  ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
Then you can access the cycle counter by:
Code:
cycles = ARM_DWT_CYCCNT;
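
To turn a raw count into a time, something like the following might help. It is only a minimal sketch: `cyclesToNanos` and `wrapPeriodSeconds` are hypothetical helper names, and `cpuHz` is an assumed parameter (on Teensy the `F_CPU` macro normally holds the CPU clock in Hz).

```cpp
#include <cassert>
#include <cstdint>

// Convert a cycle count to nanoseconds for a given CPU clock in Hz.
// cpuHz is an assumption here; pass F_CPU when building for a Teensy.
uint64_t cyclesToNanos(uint32_t cycles, uint32_t cpuHz) {
  assert(cpuHz > 0);
  // 64-bit math so that cycles * 1e9 cannot overflow.
  return (static_cast<uint64_t>(cycles) * 1000000000ULL) / cpuHz;
}

// Seconds until a free-running 32-bit cycle counter wraps back to zero.
double wrapPeriodSeconds(uint32_t cpuHz) {
  assert(cpuHz > 0);
  return 4294967296.0 / cpuHz;  // 2^32 cycles per wrap
}
```

At 48 MHz one cycle is about 20.8 ns, so `cyclesToNanos(480000, 48000000)` gives 10000000 ns (10 ms), and the counter wraps roughly every 89.5 seconds.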
 
Thanks. This works great. I expected to see overflows every 89.5 seconds based on 48 MHz cpu clock and 32 bits. But no overflows. The counter must reset when I call the function below. Don't know if this should move to tech support...

unsigned long gpsTest() {

unsigned long period = 0;
unsigned long firstEdge = 0;
unsigned long secondEdge = 0;
gps_flag = 0;
while (!gps_flag) { // wait for line to go high, start of timing interval
}
firstEdge = ARM_DWT_CYCCNT; // free running time in cycle counts
gps_flag = 0;
while (!gps_flag) { // wait for next edge
}
secondEdge = ARM_DWT_CYCCNT;
// subtract to get period
period = secondEdge - firstEdge;
return period;
}
 
Please wrap your code in code tags to make it easier to read.

What do you expect to happen to the cycle counter? It just counts upwards and your code calculates the difference between two counter states. You cannot detect counter overflows with that code.

Also, your gps_flag is set to zero before you wait for it to be high. However, I can see no mechanism that would set gps_flag high. If you have some interrupt handler that does this, please include it in the posted code and also include the declaration of gps_flag.
 
You will not see the overflow, because the subtraction "secondEdge-firstEdge" will also overflow, thereby generating the proper value.
When you check for secondEdge < firstEdge, you should see the overflow happening.
As long as each interval is shorter than one full counter period (about 89 seconds at 48 MHz, or about 45 seconds at 96 MHz), you should be OK.
 
I'm confused. Are you trying to detect when the cycle counter overflows? If so, why?

Normally, subtracting same-size unsigned integers gives the proper elapsed time, even in the presence of overflow, for the reasons kpc & christoph explained. There's normally no need to look for overflow.
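
A small illustration of why the unsigned subtraction survives a wrap, written as a sketch that also compiles on a desktop. The names `elapsedCycles` and `checkWraparound` are made up for this example; the values mimic a 480000-cycle interval that straddles the 32-bit rollover.

```cpp
#include <cassert>
#include <cstdint>

// Elapsed cycles between two readings of a free-running 32-bit counter.
// Unsigned arithmetic is modulo 2^32, so a single wrap between the two
// readings cancels out in the subtraction.
uint32_t elapsedCycles(uint32_t firstEdge, uint32_t secondEdge) {
  return secondEdge - firstEdge;
}

// Self-check with hand-picked values (hypothetical, not measured data).
void checkWraparound() {
  // No wrap between the readings.
  assert(elapsedCycles(100u, 480100u) == 480000u);
  // First reading 256 cycles before the wrap, second reading after it.
  assert(elapsedCycles(0xFFFFFF00u, 479744u) == 480000u);
}
```

The result is only ambiguous when the true interval is a full counter period or longer, because the subtraction cannot tell how many complete wraps occurred.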
 
Thanks all. I understand more now. I'm not worried about overflow or the occasional low readings. I remember a comment about that somewhere, just have to find it again.

I split out my first test into a separate program, shown below. I'm not sure what happens if the counter overflow happens between the first and second reading but the probability is low. In this test, the readings are 10 msec apart and the counter wraps every 89.5 seconds at 48 MHz, so the probability of an overflow in a single reading is about 0.01 / 89.5, or roughly 1 in 9000. I ran the program for about 3 hours and all I saw were occasional readings that were about 120 to 130 cycles low, which may be related to the flash wait states that stevech mentioned.

Here's the serial monitor for about 3 hours, printing only when the value of "test" is unusually large. Typical in my test is "test" (second column) = about -5 to -15. The input is at 100 Hz. The first column is a counter. The second is (actual cpu cycles) - (nominal cpu cycles). The third column is actual cpu cycles.

916 -122 479878
2094 -128 479872
4456 -130 479870
5638 -130 479870
8593 -129 479871
9182 -128 479872
10361 -131 479869
16269 -129 479871
18044 -125 479875
19227 -130 479870
22184 -130 479870
23367 -131 479869
24550 -129 479871
25733 -129 479871
28099 -130 479870
38152 -128 479872
39926 -128 479872
41700 -128 479872
43474 -127 479873
46430 -127 479873
49386 -130 479870
51159 -128 479872
68307 -128 479872
71860 -129 479871
76598 -129 479871
77783 -129 479871
80747 -130 479870
87264 -130 479870
92008 -131 479869
94381 -122 479878
100903 -129 479871
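
To put those deviations in more familiar units, here is a rough sketch that converts a measured count into parts per million and microseconds relative to nominal. `deviationPpm` and `deviationMicros` are hypothetical names, and the 480000-cycle nominal and 48 MHz clock are taken from the test setup described above.

```cpp
#include <cassert>
#include <cstdint>

// Deviation of a measured period from its nominal value, in parts per million.
double deviationPpm(uint32_t measuredCycles, uint32_t nominalCycles) {
  assert(nominalCycles > 0);
  double diff = static_cast<double>(measuredCycles) -
                static_cast<double>(nominalCycles);
  return diff * 1e6 / nominalCycles;
}

// The same deviation expressed in microseconds at the given CPU clock.
double deviationMicros(uint32_t measuredCycles, uint32_t nominalCycles,
                       uint32_t cpuHz) {
  assert(cpuHz > 0);
  double diff = static_cast<double>(measuredCycles) -
                static_cast<double>(nominalCycles);
  return diff * 1e6 / cpuHz;
}
```

For a reading of 479870 against a nominal 480000 at 48 MHz this works out to about -271 ppm, or about -2.7 usec. The typical -8 cycle offset is about -17 ppm.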



Code:
/* Test counting cpu cycles
Teensy 3.1, Arduino 1.61 and Teensyduino 1.21
*/
const int gpsPin = 2;
unsigned long gpsPeriod = 0;
volatile int gpsFlag; // set by gps interrupt
unsigned long longCount = 0;
long test;

void setup() {
  ARM_DEMCR |= ARM_DEMCR_TRCENA;  // info from teensy forum, access cycle counter
  ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
  pinMode(gpsPin, INPUT);
  Serial.begin(9600);
  attachInterrupt(gpsPin, gpsISR, RISING); // set flag high on rising edge
}


void loop() {
  gpsPeriod = gpsTest();
  longCount++;
  test = gpsPeriod - 480000; // 10 msec input, 48 MHz clock
  if (test < -50 || test > 50) { // test typically = -8 for this test setup
    Serial.print(longCount);
    Serial.print("  ");
    Serial.print(test);
    Serial.print("  ");
    Serial.println(gpsPeriod);
  }
  delay(100);
}

//-------------------------------------
unsigned long gpsTest() {
  unsigned long period = 0;
  unsigned long firstEdge = 0;
  unsigned long secondEdge = 0;
  gpsFlag = 0;
  while (!gpsFlag) { // wait for line to go high, start of timing interval
  }
  firstEdge = ARM_DWT_CYCCNT;  // free running time in cycle counts
  gpsFlag = 0;
  while (!gpsFlag) { // wait for next edge
  }
  secondEdge = ARM_DWT_CYCCNT;
  // subtract to get period
  period = secondEdge - firstEdge;
  return period;
}

//-------------------------------------
void gpsISR() {
  gpsFlag = 1;
}
 
try it this way:

Code:
#define CPU_RESET_CYCLECOUNTER    do { ARM_DEMCR |= ARM_DEMCR_TRCENA;          \
                                       ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA; \
                                       ARM_DWT_CYCCNT = 0; } while(0)
volatile int cycles;


void setup() {
  // put your setup code here, to run once:
  while (!Serial);
  delay(100);
}


void loop() {
  // put your main code here, to run repeatedly:
  //__disable_irq();
  CPU_RESET_CYCLECOUNTER;
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  asm volatile ("nop");
  cycles = ARM_DWT_CYCCNT;
  //__enable_irq();
  Serial.print("cycles: ");
  Serial.println(cycles);
  delay(1000);
}

Should get 10 cycles...
 
The cycle counts are NOT locked in. Even with the asm 'nop' code, the compiler's choices and the resulting cache behavior, execution path and surrounding code can give varied (but consistent) results on small tests.

@Stevech
In 30+ net runs I also saw the first IRQ enabled loop twice show one iteration with a count of 251 or higher - but it wasn't the first iteration. So indeed if an interrupt happens during the measuring "it seems" you get those cycles counted. The disable_IRQ showed some higher (11-14) counts, but generally resulted in 10 where 10 was expected.

I ran the posted code in a for() and get 10 on the first iteration and 11 on subsequent iterations.

When I clone that loop the first repeats 10(1st) then 11's and then with uncommented disable_irq code the first iteration hits 12 and 13 on subsequent iterations.

Funny thing, I put two "nop" after the disable_irq before the RESET and those times all went to '10'. Then I cloned that loop to differentiate it and now with and without the preceding 'nop' the counts are 10, and the first with interrupts on is now 13 not 11 - then the rest go to 10.

I recompiled at 96MHz - no Optimize and the numbers never hit 10 but are solid 14 and 12 and 11 for those three cases.

Then I put delay(10) after the Serial.print's and I get 17 and 11(1st) then 10's and 10's. Going to 72MHz OPT I get higher numbers, 72MHz no-OPT I get 17 and 10 and 10. For the last set I took out the delay(10) and get 12 and 10 and 10.

Three groups in order are: commented disable_irq and uncommented disable_irq and uncommented disable_irq with two pre-RESET 'nop'

I went back to 96MHz Optimized and got the 12 and 10 and 10. I then commented out the third loop and the prior behavior returned, so it is consistent for a given compilation process, but it can be varied by the surrounding code - and not always as expected.
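
Since single readings move around like this, one option is to collect several and look at the spread instead of trusting any one value. The following is just a sketch of that idea, not something from the tests above; `CycleStats` and `summarize` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Lowest and highest cycle counts seen across repeated measurements.
struct CycleStats {
  uint32_t lowest;
  uint32_t highest;
};

// Scan n samples and report the extremes. The lowest value is usually the
// closest to the undisturbed cost; a much larger highest value hints that an
// interrupt or some other disturbance landed inside the measured region.
CycleStats summarize(const uint32_t* samples, size_t n) {
  assert(samples != nullptr && n > 0);
  CycleStats s = { samples[0], samples[0] };
  for (size_t i = 1; i < n; ++i) {
    if (samples[i] < s.lowest)  s.lowest  = samples[i];
    if (samples[i] > s.highest) s.highest = samples[i];
  }
  return s;
}
```

With readings such as 10, 11, 11, 251, 11 this reports a lowest of 10 and a highest of 251, which makes the interrupt-affected iteration easy to spot.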
 
i ran into a weird issue when trying this (but i'm probably missing something obvious) -- the sketch below just prints the cycle counter every now and then. as far as i can tell that's basically what kpc suggested above. it works fine when i upload the sketch from the IDE, in which case the counter increments. it stops working however after a power cycle (and keeping the serial monitor open; or printing to a display), in which case the counter refuses to increment ?


Code:
volatile uint32_t TIME_STAMP;
uint32_t _wait;

void FASTRUN clk_ISR() 
{  
   TIME_STAMP = ARM_DWT_CYCCNT;    
}

void setup()
{ 
   pinMode(0, INPUT);
   ARM_DEMCR |= ARM_DEMCR_TRCENA;  
   ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
   attachInterrupt(0, clk_ISR, FALLING); 
   
}



void loop() 
{
  
  while (1) 
  {
    
        if(millis() - _wait > 100) 
        { 
            _wait = millis();
            Serial.println(TIME_STAMP);
        }
  }
}
 
I had all zeros and no activity - nothing running on my pin 0?

I ran the int with a timer and it looks right to me - IDE load or re-power with this code?

Code:
#include <TimerOne.h>
volatile uint32_t TIME_STAMP = 0;
uint32_t _wait;

void FASTRUN clk_ISR()
{
  if (0 == TIME_STAMP )
    TIME_STAMP = ARM_DWT_CYCCNT;
}

long unsigned MicrosVal = 330000;

#define qBlink() (digitalWriteFast(LED_BUILTIN, !digitalReadFast(LED_BUILTIN) ))

void setup() {
  digitalWriteFast(LED_BUILTIN, 0);
  pinMode(LED_BUILTIN, OUTPUT);
  Serial.begin(9600);  // USB
  qBlink();
  while (!Serial && millis() <= 3000) if (!(millis() % 200))  qBlink();
  Serial.print("Hello World! ... ");
  Serial.println(millis());
  digitalWriteFast(LED_BUILTIN, 1);

//  pinMode(0, INPUT);
  ARM_DEMCR |= ARM_DEMCR_TRCENA;
  ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
//  attachInterrupt(0, clk_ISR, FALLING);
  Timer1.initialize(MicrosVal);
  Timer1.attachInterrupt(clk_ISR);

}



void loop()
{
  if (millis() - _wait > 100)
  {
    if ( TIME_STAMP ) {
      Serial.println(TIME_STAMP);
      TIME_STAMP = 0;
    }
    _wait = millis();
    qBlink();
  }
}

Funner with this line having one less zero: long unsigned MicrosVal = 33000;
 
I had all zeros and no activity - nothing running on my pin 0?

sure, you'd need some activity on pin 0. .. but thanks for trying.


I ran the int with a timer and it looks right to me - IDE load or re-power with this code?

so are you suggesting i reset the variable? i fail to see any significant difference in the code, except for the additional "if" ...edit: ** well, and except for the delay(XXXX) in setup(). that was that, it seems....
 
Indeed YMMV. I'm living out of a suitcase and have a T3.1. It was late here but I wanted to try something. I have a wire, but adding the timer was easier and it worked. I was wondering about the volatility, and whether the slow check/set on zero would prevent a change during use, in case that was the trouble, without disabling interrupts.
And I did my usual setup and blink add to make sure that was not why you saw changing behavior.
 

oh, it didn't vary! sorry for being unclear. what i meant to say was: the significant difference consisted in putting some sort of delay into the setup() routine, ie before doing
Code:
  ARM_DEMCR |= ARM_DEMCR_TRCENA;
  ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
... thanks again!
 
Sorry, ARM_DWT_CYCCNT is one of many "unnecessary" features ARM removed from Cortex-M0+ to reduce the cost. It simply doesn't exist on Teensy LC.
 
ARM_DWT_CYCCNT always has value 0 on Teensy-LC. Can that be fixed? Or am I stupid.
You still have the systick timer / counter. The way it is set up, it counts down CPU cycles for 1 millisecond (downwards) and will then wrap. At 48MHz, it counts from 47999 down to 0. Use it like this: "uint32_t count = SYST_CVR;".

It's possible to extend it to a 32-bit counter, but that will be fairly slow and take 34 clock cycles.
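
Because SysTick counts down and reloads, the difference between two readings has to be taken the other way round from the cycle counter. A minimal sketch, assuming both readings fall within one reload period (1 millisecond here) of each other; `systickElapsed` is a hypothetical helper, and `reload` would be 47999 at 48 MHz as described above (I believe the core exposes it as `SYST_RVR`, but treat that as an assumption).

```cpp
#include <cassert>
#include <cstdint>

// Elapsed cycles between two readings of a down-counting timer that
// reloads to `reload` after reaching 0. Only valid when less than one
// full period (reload + 1 cycles) passed between the two readings.
uint32_t systickElapsed(uint32_t startCvr, uint32_t endCvr, uint32_t reload) {
  assert(startCvr <= reload && endCvr <= reload);
  if (startCvr >= endCvr) {
    return startCvr - endCvr;               // no reload in between
  }
  return startCvr + (reload + 1u) - endCvr; // counter reloaded once
}
```

For example, a start reading of 5 and an end reading of 47995 with a reload value of 47999 gives 10 elapsed cycles. Intervals longer than one period need the millisecond count as well, which is where the extra cost mentioned above comes from.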
 
Thanks, that works great. Here's a good hardware random number generator in a few lines of code that works on Teensy LC. It uses the built-in LED as a leaky capacitor. After it is turned on, it is set to an input. The voltage rapidly drops while it emits light, then more slowly, all above the 1->0 input threshold. After about 50us it finally reads as zero, but there's a lot of run-to-run variation. Assuming at least one bit of entropy per charge-discharge cycle, the RNG is good. It arose from a discussion on what would be cheapest cryptographically secure HWRNG. It works on ESP8266 (hackers found an undocumented built-in HWRNG) and PSOC and I'm sure others. The LED can be used for other things too.

Code:
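uint32_t rotateRight(uint32_t value, int count){
  // keep both shifts in 0..31: shifting a 32-bit value by 32 (count == 0) is undefined
  count &= 31;
  return (value >> count) | (value << ((32 - count) & 31));
}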

uint32_t led_rng(void){
  static uint32_t res=0U; // only initializes once
  register uint32_t loops;
  for(int i=0; i<32; i++){
    pinMode(LED_BUILTIN, OUTPUT);
    digitalWrite(LED_BUILTIN, HIGH);
    pinMode(LED_BUILTIN, INPUT);
    while(digitalRead(LED_BUILTIN)==HIGH);
    loops = SYST_CVR; // highest resolution timer on Teensy-LC
    res ^= rotateRight(loops,i);
  }
  return res;
}
 
void setup() {
  Serial.begin(115200);
}

String padTo8(String s)
{
  while(s.length() < 8) s = "0" + s;
  return s;
}

void loop() {
  for(int i=0; i<10;i++) Serial.print(padTo8(String(led_rng(),HEX)));
  Serial.println();
}
 
The data rate depends on the light level on the LED. You get a light sensor and a solid state physics lesson for free! With a bright flashlight the data almost stops.
 

Clever! probably more crypto-secure than reading floating analog pins, but an adversary might be able to affect the RNG with a bright light ...

Note, new Teensy 3.5 and 3.6 have builtin hardware RNG (7.5 million random bits/sec)

Also, though slow, entropy can be generated by dueling clocks, see
https://code.google.com/archive/p/avr-hardware-random-number-generation/wikis/WikiAVRentropy.wiki

EDIT:
I confirmed the LED discharge sketch on my LC, and it worked on T3.0 and T3.6. But on T3.5, T3.1 and T3.2 it hangs because the floating input voltage on pin 13 never drops below 1.62 V, so the pin never reads LOW. That response may vary from chip to chip, or perhaps 5V-tolerant Teensys won't drop the pin 13 voltage. Also, the R3 UNOs have an op-amp in front of the LED on pin 13, so the RNG won't work on Arduino boards with op-amp-buffered LEDs. I also confirmed that a bright light on the LED slows the response, and stops the LED discharge if the light is bright and close to the LED.
 