Thread: Sub Micro Second Pulses

  1. #1
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    1,084

    Sub micro second pulses

    I had a need for sub micro second square wave pulses, so I wrote this. It looks fine on a scope. Comments are appreciated.

    Code:
    // Routine to delay for specified number of nano seconds
    // NOTE:  minimum pulse width is ~700 nsec, accuracy is ~ -0/+40 ns
    // NOTE:  you can't trust this code:
    //        compiler or library changes will change timing overhead
    //        CPU speed will affect timing
    
    // Jon Zeeff  V1.1
    // Public Domain
    // Written for teensy 3.1
    
    #define LED_PIN 13
    
    void setup() {
      delay(1000);
      pinMode(LED_PIN, OUTPUT);
      Serial.println("hello");
    }
    
    void loop() {
    
      //Setup_Nano_Delay(4000000000);
      Setup_Nano_Delay(700);
    
      Serial.println("start");
      delay(10);  // allow start message to go out
      noInterrupts();
      digitalWriteFast(LED_PIN, 1);
      Nano_Delay();
      digitalWriteFast(LED_PIN, 0);
      interrupts();
      Serial.println("stop\n");
    
      delay(2000);
    }
    
    // delay for a given number of nano seconds
    // less sensitive to interrupts and DMA
    // max delay is 4 seconds
    
    constexpr double   CLOCK_RATE = 96.00000E6;     // MCU clock rate - measure it for best accuracy
    constexpr unsigned NANO_OVERHEAD = 470;         // overhead - adjust as needed
    constexpr unsigned NANO_JITTER = 18;            // adjusts for jitter prevention - leave at 18
    
    // prepare before, so less delay later
    static uint32_t nano_ticks;
    
    void Setup_Nano_Delay(uint32_t nanos)
    {
      // set up cycle counter
      ARM_DEMCR |= ARM_DEMCR_TRCENA;
      ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
    
      // improve teensy 3.1 clock accuracy
      OSC0_CR = 0x2;
    
      // we can't do less than this
      if (nanos < NANO_OVERHEAD)
         nanos = NANO_OVERHEAD;
       
      // how many cycles to wait
      nano_ticks = ((nanos - NANO_OVERHEAD) / (1.0E9 / CLOCK_RATE)) + .5;
      
      if (nano_ticks < NANO_JITTER)
         nano_ticks = NANO_JITTER;
              
    } // Setup_Nano_Delay()
    
    // Do the delay specified above.
    // You may want to disable interrupts before and after
    
    FASTRUN void Nano_Delay(void)
    {
      uint32_t start_time = ARM_DWT_CYCCNT;
      uint32_t loop_ticks = nano_ticks - NANO_JITTER;
    
      // loop until time is almost up
      while ((ARM_DWT_CYCCNT - start_time) < loop_ticks) {
         // could do other things here
      }
    
      if (NANO_JITTER) {   // compile time option
    
        // delay for the remainder using single instructions;
        // each case falls through, so case N executes N NOPs
        switch (nano_ticks - (ARM_DWT_CYCCNT - start_time)) {
          case 18: __asm__ volatile("nop" "\n\t");
          case 17: __asm__ volatile("nop" "\n\t");
          case 16: __asm__ volatile("nop" "\n\t");
          case 15: __asm__ volatile("nop" "\n\t");
          case 14: __asm__ volatile("nop" "\n\t");
          case 13: __asm__ volatile("nop" "\n\t");
          case 12: __asm__ volatile("nop" "\n\t");
          case 11: __asm__ volatile("nop" "\n\t");
          case 10: __asm__ volatile("nop" "\n\t");
          case 9: __asm__ volatile("nop" "\n\t");
          case 8: __asm__ volatile("nop" "\n\t");
          case 7: __asm__ volatile("nop" "\n\t");
          case 6: __asm__ volatile("nop" "\n\t");
          case 5: __asm__ volatile("nop" "\n\t");
          case 4: __asm__ volatile("nop" "\n\t");
          case 3: __asm__ volatile("nop" "\n\t");
          case 2: __asm__ volatile("nop" "\n\t");
          case 1: __asm__ volatile("nop" "\n\t");
          default:
               break;
        }  // switch()
      
      } // if
     
    }  // Nano_Delay()
    Last edited by jonr; 07-07-2015 at 07:02 PM.

  2. #2
    Junior Member
    Join Date
    Jun 2016
    Posts
    5
    Thanks a lot for this nice code.
    I use it with FastLED 3.1xx with no problems.

  3. #3
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,392
    You could try to use one of the timers and do it "in hardware". That would be more reliable, and you could use interrupts or DMA without negative effects on the timing.

  4. #4
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    1,084
    Since I had some questions via private message: the basic idea is to start a hardware timer, wait until it runs out, and then run NOPs for the remaining time. The timer alone is too coarse-grained.
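To make the split between the timed wait and the NOPs concrete, here is a minimal host-side sketch of the arithmetic from the first post. The constants are copied from that code; the helper names `total_ticks` and `polled_ticks` are hypothetical and only used for illustration.

```cpp
#include <cstdint>

// Values copied from the first post (Teensy 3.1 at 96 MHz).
const double   CLOCK_RATE    = 96.0e6;  // CPU cycles per second
const uint32_t NANO_OVERHEAD = 470;     // fixed overhead in nanoseconds
const uint32_t NANO_JITTER   = 18;      // cycles reserved for the NOP ladder

// Total cycles to wait, mirroring the math in Setup_Nano_Delay().
uint32_t total_ticks(uint32_t nanos) {
  if (nanos < NANO_OVERHEAD) nanos = NANO_OVERHEAD;
  uint32_t ticks = static_cast<uint32_t>(
      (nanos - NANO_OVERHEAD) / (1.0e9 / CLOCK_RATE) + 0.5);
  return ticks < NANO_JITTER ? NANO_JITTER : ticks;
}

// Cycles spent polling the cycle counter; the NOP ladder covers the rest.
uint32_t polled_ticks(uint32_t nanos) {
  return total_ticks(nanos) - NANO_JITTER;
}
```

For a 700 ns request this gives 22 cycles in total, of which only 4 are spent polling the counter; the remaining cycles (up to 18) are covered by the NOP ladder.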

  5. #5
    Junior Member
    Join Date
    Mar 2019
    Posts
    1
    Quote Originally Posted by jonr View Post
    I had a need for sub micro second square wave pulses, so I wrote this. It looks fine on a scope. Comments are appreciated.

    Exactly what I needed! You just saved me several hours. Thanks!

  6. #6
    Thanks guys, my application was DShot for RC motor control. But in the end I had to resort to inline code with null loops and NOPs for 16 MHz devices.

  7. #7
    Senior Member
    Join Date
    Jul 2020
    Posts
    174
    When I wrote a driver that needed 400-nanosecond-wide pulses, I generated a waveform in RAM and used DMA to transfer it to the PWM generator. The 'scope demonstrated that it was quite accurate. It doesn't take much RAM to do this, and other than issuing instructions to the DMA controller, the CPU is free to do its own thing (can process interrupts, and so on.) The DMA controller can be programmed to transfer multiple chained blocks, and you can even do a ring buffer if you want.
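As a rough illustration of "a waveform in RAM", the sketch below builds a buffer with one state per sample period. It is a host-side sketch only: the function name `make_pulse`, the one-byte-per-state layout, and the 50 ns sample period in the usage note are all assumptions, and the DMA and PWM setup is left out because it depends on the chip.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build a single-pulse waveform: high for high_ns, then low for low_ns,
// with one buffer entry per sample period (sample_ns).
// All parameters are hypothetical; durations are truncated to whole samples.
std::vector<uint8_t> make_pulse(uint32_t high_ns, uint32_t low_ns,
                                uint32_t sample_ns) {
  std::vector<uint8_t> wave;
  wave.insert(wave.end(), static_cast<std::size_t>(high_ns / sample_ns),
              uint8_t{1});
  wave.insert(wave.end(), static_cast<std::size_t>(low_ns / sample_ns),
              uint8_t{0});
  return wave;
}
```

With an assumed 50 ns sample period, `make_pulse(400, 850, 50)` yields 8 high samples followed by 17 low samples, so a 400 ns pulse costs only 25 bytes in this layout.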

  8. #8
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    1,084
    What is the fastest rate (words/sec) that DMA can output?

  9. #9
    Senior Member
    Join Date
    Jul 2020
    Posts
    174
    That depends greatly on the microcontroller, bus, and signaling protocol. If there are data + clock lines, then you can have as many bits per second as states (low or high) per second. On the other hand, if there is no dedicated clock line, then the data line has to be self-clocking, and that usually means one bit takes more than one high/low state to transfer.

    The DMA on any modern Teensy can support 400-nanosecond-wide pulses (as evidenced by OctoWS2811) and I would guess it can get all the way down to double-digit nanoseconds. For precise numbers, you would have to look at the particular model's CPU manual or datasheet, which can be found here: https://www.pjrc.com/teensy/datasheets.html

    One good example of this is the OctoWS2811 library, which can drive anywhere from 1 to 8 NeoPixel arrays. Up to 8 PWM pins are driven by separate DMA channels. The total throughput is 20,000,000 states/second. Because the WS281x protocol requires three states to transmit one bit, that gives you 6.66 million bits/second, minus latch time. (At the end of every frame there's a continuous low signal, 50 microseconds I think, which means "latch." That causes the shift registers inside the LEDs to dump their contents to their PWM generators, resulting in the display of whatever color was sent.)
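The "three states per bit" arithmetic can be sketched as follows. The encoding used here (a 1 is high, high, low; a 0 is high, low, low; most significant bit first) is an assumption about how a WS281x bit is split into thirds, and the function names are hypothetical rather than taken from OctoWS2811.

```cpp
#include <cstdint>
#include <vector>

// Expand one byte, MSB first, into three states per bit.
// Assumed encoding: 1 -> high, high, low; 0 -> high, low, low.
std::vector<uint8_t> encode_byte(uint8_t value) {
  std::vector<uint8_t> states;
  for (int bit = 7; bit >= 0; --bit) {
    const uint8_t b = (value >> bit) & 1u;
    states.push_back(1);  // every bit starts high
    states.push_back(b);  // the middle state carries the data
    states.push_back(0);  // every bit ends low
  }
  return states;
}

// Bits per second available from a given state rate (three states per bit).
double bits_per_second(double states_per_second) {
  return states_per_second / 3.0;
}
```

One byte becomes 24 states, and `bits_per_second(20e6)` works out to about 6.67 million bits/second, matching the figure above before latch time is subtracted.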

    On the more extreme end, if the DMA and PWM controllers can support 50-nanosecond pulses, you would be getting 20 megabits/sec on a single pin, which is in the range of USB full-speed signaling (12 megabits/sec) but nowhere near USB 3 (5 gigabits/sec). (This doesn't take into account framing and error correction, so you are going a little slower than that. NeoPixels have no error correction, so their protocol is more straightforward.)

  10. #10

    Talking teensy

    what is the max freq it will support in Teensy 3.5?
    if yes which is the best option?

  11. #11
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    9,392
    Quote Originally Posted by Valarmathi A S View Post
    if yes which is the best option?
    Best option for what?

  12. #12
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    25,208
    Quote Originally Posted by Valarmathi A S View Post
    what is the max freq it will support in Teensy 3.5?
    To answer your question, I ran this code on a Teensy 3.5 and connected my oscilloscope.

    Code:
    void setup() {
      pinMode(13, OUTPUT);
      PORTC_PCR5 &= ~0x04; // disable slew rate limit
      noInterrupts();
      while (1) {
        digitalWriteFast(13, HIGH);
        digitalWriteFast(13, LOW);
        digitalWriteFast(13, HIGH);
        digitalWriteFast(13, LOW);
        digitalWriteFast(13, HIGH);
        digitalWriteFast(13, LOW);
      }
    }
    
    void loop() {
    }
    [Image: oscilloscope capture, file.png]


    As you can see in the scope's measurements, the burst of 3 pulses is at 60 MHz. But the loop overhead causes a substantial delay between bursts, giving a 40 MHz overall pulse rate for this particular example.
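The two frequencies can be turned into a cycle count with a little arithmetic. This sketch assumes the Teensy 3.5 was running at its default 120 MHz CPU clock, which the post does not state; the function name is hypothetical.

```cpp
#include <cstdint>

// CPU cycles needed to emit `pulses` pulses at a pulse rate of `pulse_hz`.
// Integer math; assumes the rates divide evenly, as they do here.
uint64_t cycles_for(uint64_t cpu_hz, uint64_t pulses, uint64_t pulse_hz) {
  return cpu_hz * pulses / pulse_hz;
}

// Assumed CPU clock: Teensy 3.5 default of 120 MHz (not stated in the post).
const uint64_t CPU_HZ = 120000000;
```

Under that assumption, `cycles_for(CPU_HZ, 3, 60000000)` is 6 cycles for the three pulses themselves and `cycles_for(CPU_HZ, 3, 40000000)` is 9 cycles per trip around the loop, so the loop overhead would be about 3 cycles, or 25 ns.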

    Hopefully the loop overhead illustrates that, even though you can get very fast pulses using digitalWriteFast(), the surrounding code matters quite a lot in any real application. This example also completely disables interrupts, but in normal use they come into play as well.

    Or I guess I could have just answered: 60 MHz. That is the maximum!

  13. #13
    Junior Member
    Join Date
    Mar 2021
    Posts
    4
    Quote Originally Posted by Pilot View Post
    When I wrote a driver that needed 400-nanosecond-wide pulses, I generated a waveform in RAM and used DMA to transfer it to the PWM generator. The 'scope demonstrated that it was quite accurate. It doesn't take much RAM to do this, and other than issuing instructions to the DMA controller, the CPU is free to do its own thing (can process interrupts, and so on.) The DMA controller can be programmed to transfer multiple chained blocks, and you can even do a ring buffer if you want.
    Hi, I think I am trying to do something similar to what you describe here. I am using a Teensy 4.1. I would like to generate signals that are around 100ns in width for on/off keying at 10MHz. Is it possible to update the PWM module this quickly using DMA? Would you be able to share your driver code? Thanks!
