Thread: Speed of digitalRead and digitalWrite with Teensy3.0

  1. #1
    Junior Member
    Join Date
    Nov 2013
    Posts
    5

    Speed of digitalRead and digitalWrite with Teensy3.0

    Hi,

    I was wondering whether anyone has already measured the duration of digitalRead and digitalWrite. In my Arduino project, both calls caused a significant delay, and I had to access the ports directly. Is this delay also present on the Teensy 3.0?

    Best Nils

  2. #2
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    27,102
    Yes, but it's much less delay.

    Teensy 3.0 has 2 other functions, digitalReadFast() and digitalWriteFast() which you can use for the fastest possible performance. They are extremely quick.

    But there are a couple small caveats to these fast versions (the reason why the normal ones aren't made fast). The main limitation is the fast ones only work with a constant for the pin number. The fast ones also skip some fancy stuff that's needed for perfect Arduino compatibility, like using digitalWrite() to control the pullup resistor on a pin configured for input mode.

    Here are a couple quick tests. First, the normal digitalWrite():

    Code:
    void setup() {
      pinMode(2, OUTPUT);
    }
    void loop() {
      while (1) {
        digitalWrite(2, HIGH);
        digitalWrite(2, LOW);
        digitalWrite(2, HIGH);
        digitalWrite(2, LOW);
        digitalWrite(2, HIGH);
        digitalWrite(2, LOW);
      }
    }
    [Image: scope_0.png]

    As you can see from the 700 kHz waveform, each digitalWrite() is taking approx 0.71 us.
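The arithmetic behind that estimate can be sketched as follows. One period of the square wave contains two digitalWrite() calls (one HIGH, one LOW), so the time per call is the period divided by two. The helper name `secondsPerCall` is made up for illustration and is not part of any Teensy or Arduino API; the 700 kHz input is the approximate figure read off the scope.

```cpp
#include <cassert>

// Hypothetical helper (illustration only): given the frequency of the
// observed square wave and the number of calls that make up one full
// period, return the time spent per call in seconds.
double secondsPerCall(double waveformHz, int callsPerPeriod) {
  assert(waveformHz > 0.0 && callsPerPeriod > 0);
  return 1.0 / (waveformHz * callsPerPeriod);
}
```

Calling `secondsPerCall(700e3, 2)` gives roughly 0.71e-6 s, which matches the 0.71 us per digitalWrite() quoted above.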

    Here's another test using digitalWriteFast():

    Code:
    void setup() {
      pinMode(2, OUTPUT);
    }
    void loop() {
      while (1) {
        digitalWriteFast(2, HIGH);
        digitalWriteFast(2, LOW);
        digitalWriteFast(2, HIGH);
        digitalWriteFast(2, LOW);
        digitalWriteFast(2, HIGH);
        digitalWriteFast(2, LOW);
      }
    }
    [Image: scope_1.png]

    This shows the extreme speed that's possible. A pair of digitalWriteFast() calls takes only 21 ns. It's so fast that the rise and fall times of the digital output can be easily seen (the scope's horizontal scale is 50X faster than the one above). This was tested with a 2 inch wire and my scope probe using an ordinary 3 inch ground clip, so there's some overshoot. To make proper measurements at these high bandwidths, better probing with short wires and ground leads is needed. I just quickly hooked my scope up to a test board I had lying on my desk for the sake of this message. Also, my scope is only a 200 MHz model, and this waveform is 48 MHz in the pulses, so this is pushing close to the limits of my equipment.

    Another obvious feature of this waveform is the dead time between each group of 3 pulses. Some of that time is the loop overhead, to branch back and execute the same code over again. But it also includes some compiler overhead, and another thing I'll mention in a moment. On ARM, writing to the pin involves a store instruction, which needs 2 registers loaded with constants. Some of that dead time is the compiler placing the constants into registers, in preparation for the 6 digitalWriteFast() lines. The compiler is pretty smart about loading registers in advance to optimize loops. Without understanding this, it might seem like digitalWriteFast() only takes 10.5 ns, but in fact there is overhead to set up the registers.

    The other factor at play here is a special hardware optimization in the Cortex-M4 chip for back-to-back bus operations. Normally a store instruction takes 2 cycles. But if your code uses multiple store (or load) instructions in a row, it uses a special bus burst mode where the 2nd, 3rd, etc only take a single cycle. In this waveform, we're seeing that effect. The first digitalWriteFast() actually took twice as long as the other 5, but the effect isn't visible since the line was already low.
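As a rough sketch of the timing just described, the cost of a run of back-to-back stores can be modeled as 2 cycles for the first store plus 1 cycle for each one after it. This is a simplified model built only from the description in this post, not a cycle-accurate account of the Cortex-M4 bus. The function names are hypothetical, and the CPU clock is left as a parameter because the post does not state it.

```cpp
#include <cassert>

// Simplified model (illustration only): the first store in a
// back-to-back run costs 2 cycles, each following store costs 1.
unsigned backToBackStoreCycles(unsigned storeCount) {
  if (storeCount == 0) return 0;
  return 2 + (storeCount - 1);
}

// Convert a cycle count to nanoseconds for a given CPU clock in Hz.
// The clock frequency is an input, not a value taken from the post.
double cyclesToNanoseconds(unsigned cycles, double cpuHz) {
  assert(cpuHz > 0.0);
  return cycles * 1e9 / cpuHz;
}
```

Under this model, six back-to-back stores cost 7 cycles. If the clock is assumed to be 96 MHz (an assumption, though it agrees with the measurement), two single-cycle stores take about 20.8 ns, in line with the 21 ns per pair seen on the scope.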

    So digitalWriteFast() can give you extreme speed, and for some uses it can create tiny 10 ns wide pulses (which might be much too fast for some chips), but there is some overhead which the compiler will sometimes optimize to outside of loops.
    Last edited by PaulStoffregen; 11-27-2013 at 11:00 AM.

  3. #3
    Senior Member+ MichaelMeissner's Avatar
    Join Date
    Nov 2012
    Location
    Ayer, Massachusetts
    Posts
    4,328
    Paul, have you considered making digitalWrite a macro that uses __builtin_constant_p to call digitalWriteFast automagically? Something like:

    Code:
    void digitalWrite(uint8_t pin, uint8_t val);
    static inline void digitalWriteFast(uint8_t pin, uint8_t val) __attribute__((always_inline, unused));
    static inline void digitalWriteFast(uint8_t pin, uint8_t val)
    {
      // stuff to do digitalWriteFast
    }
    
    // Convert digitalWrite into digitalWriteFast if the pin argument is a compile-time constant.
    // The parentheses around digitalWrite ensure the function is called, and not the macro.
    #define digitalWrite(PIN, VAL) (__builtin_constant_p (PIN) ? digitalWriteFast (PIN, VAL) : (digitalWrite) (PIN, VAL))
    You would need:

    Code:
    #undef digitalWrite
    before the definition.

    In looking at it, since digitalWriteFast already does the __builtin_constant_p test, you could potentially move the guts to digitalWrite, and then use:

    Code:
    #define digitalWriteFast(PIN, VAL) digitalWrite(PIN, VAL)
    This assumes that there is no additional processing that digitalWrite does for constant pins that isn't done for digitalWriteFast. Presumably the same would go for digitalRead/digitalReadFast.

    The __builtin_constant_p function is documented at: http://gcc.gnu.org/onlinedocs/gcc-4....Other-Builtins
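To see the dispatch idea in isolation, here is a minimal host-side sketch that can be compiled on a PC with GCC or Clang (both provide `__builtin_constant_p` as an extension). The two function bodies are stand-ins that only count calls; they are not the Teensy core's implementation. The counters and the `writeConstantPin` / `writeRuntimePin` helpers are hypothetical names added for the demonstration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in counters so the chosen path can be observed without hardware.
static int slowCalls = 0;
static int fastCalls = 0;

// Stand-in for the normal, fully compatible function.
void digitalWrite(uint8_t pin, uint8_t val) {
  (void)pin; (void)val;
  ++slowCalls;
}

// Stand-in for the fast version.
static inline void digitalWriteFast(uint8_t pin, uint8_t val) {
  (void)pin; (void)val;
  ++fastCalls;
}

// Pick the fast path when PIN is a compile-time constant.
// (digitalWrite) in parentheses names the real function, not the macro.
#define digitalWrite(PIN, VAL) \
  (__builtin_constant_p(PIN) ? digitalWriteFast((PIN), (VAL)) \
                             : (digitalWrite)((PIN), (VAL)))

void writeConstantPin() {
  digitalWrite(2, 1);  // literal pin number: fast path
}

void writeRuntimePin(volatile uint8_t *pinSource) {
  // A volatile read is never a compile-time constant: normal path.
  digitalWrite(*pinSource, 1);
}
```

The pin is read through a volatile pointer in the second helper so that the optimizer cannot turn it into a constant. With an ordinary variable, the result of `__builtin_constant_p` can depend on the optimization level.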

  4. #4
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    27,102
    Yes, I did that on Teensy 2.0, planning to do it on 3.0 at some point.

    But on 3.0, even using __builtin_constant_p, more stuff is needed to emulate the AVR quirks and handle PWM pins, so it won't ever be as fast as using digitalWriteFast which skips those checks.

  5. #5
    Senior Member
    Join Date
    Jan 2014
    Location
    London, UK
    Posts
    122
    I notice that the Arduino IDE doesn't highlight the digitalWriteFast() or digitalReadFast() functions... Am I doing something wrong?

  6. #6
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    27,102
    Quote Originally Posted by bloodline View Post
    I notice that the Arduino IDE doesn't highlight the digitalWriteFast() or digitalReadFast() functions... Am I doing something wrong?
    Oops, those were never added to the keyword highlighting. It's not you.

    I'll add them in the next release.
