Thread: Sub Micro Second Pulses

  1. #1
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    654

    Sub micro second pulses

    I needed sub-microsecond square-wave pulses, so I wrote this. It looks fine on a scope. Comments are appreciated.

    Code:
    // Routine to delay for a specified number of nanoseconds
    // NOTE:  minimum pulse width is ~700 ns, accuracy is ~ -0/+40 ns
    // NOTE:  you can't trust this code:
    //        compiler or library changes will change the timing overhead
    //        CPU speed will affect timing
    
    // Jon Zeeff  V1.1
    // Public Domain
    // Written for teensy 3.1
    
    #define LED_PIN 13
    
    void setup() {
      delay(1000);
      pinMode(LED_PIN, OUTPUT);
      Serial.println("hello");
    }
    
    void loop() {
    
      //Setup_Nano_Delay(4000000000);
      Setup_Nano_Delay(700);
    
      Serial.println("start");
      delay(10);  // allow start message to go out
      noInterrupts();
      digitalWriteFast(LED_PIN, 1);
      Nano_Delay();
      digitalWriteFast(LED_PIN, 0);
      interrupts();
      Serial.println("stop\n");
    
      delay(2000);
    }
    
    // delay for a given number of nanoseconds
    // less sensitive to interrupts and DMA
    // max delay is 4 seconds
    
    constexpr double   CLOCK_RATE = 96.00000E6;     // MCU clock rate - measure it for best accuracy
    constexpr unsigned NANO_OVERHEAD = 470;         // overhead - adjust as needed
    constexpr unsigned NANO_JITTER = 18;            // adjusts for jitter prevention - leave at 18
    
    // prepare before, so less delay later
    static uint32_t nano_ticks;
    
    void Setup_Nano_Delay(uint32_t nanos)
    {
      // set up cycle counter
      ARM_DEMCR |= ARM_DEMCR_TRCENA;
      ARM_DWT_CTRL |= ARM_DWT_CTRL_CYCCNTENA;
    
      // improve teensy 3.1 clock accuracy
      OSC0_CR = 0x2;
    
      // we can't do less than this
      if (nanos < NANO_OVERHEAD)
         nanos = NANO_OVERHEAD;
       
      // how many cycles to wait
      nano_ticks = ((nanos - NANO_OVERHEAD) / (1.0E9 / CLOCK_RATE)) + .5;
      
      if (nano_ticks < NANO_JITTER)
         nano_ticks = NANO_JITTER;
              
    } // Setup_Nano_Delay()
    
    // Do the delay specified above.
    // You may want to disable interrupts before and after
    
    FASTRUN void Nano_Delay(void)
    {
      uint32_t start_time = ARM_DWT_CYCCNT;
      uint32_t loop_ticks = nano_ticks - NANO_JITTER;
    
      // loop until time is almost up
      while ((ARM_DWT_CYCCNT - start_time) < loop_ticks) {
         // could do other things here
      }
    
      if (NANO_JITTER) {   // compile time option
    
        register unsigned r;          // for debugging
        
        // delay for the remainder using single instructions
        switch (r = (nano_ticks - (ARM_DWT_CYCCNT - start_time))) {
          case 18: __asm__ volatile("nop" "\n\t");
          case 17: __asm__ volatile("nop" "\n\t");
          case 16: __asm__ volatile("nop" "\n\t");
          case 15: __asm__ volatile("nop" "\n\t");
          case 14: __asm__ volatile("nop" "\n\t");
          case 13: __asm__ volatile("nop" "\n\t");
          case 12: __asm__ volatile("nop" "\n\t");
          case 11: __asm__ volatile("nop" "\n\t");
          case 10: __asm__ volatile("nop" "\n\t");
          case 9: __asm__ volatile("nop" "\n\t");
          case 8: __asm__ volatile("nop" "\n\t");
          case 7: __asm__ volatile("nop" "\n\t");
          case 6: __asm__ volatile("nop" "\n\t");
          case 5: __asm__ volatile("nop" "\n\t");
          case 4: __asm__ volatile("nop" "\n\t");
          case 3: __asm__ volatile("nop" "\n\t");
          case 2: __asm__ volatile("nop" "\n\t");
          case 1: __asm__ volatile("nop" "\n\t");
          default:
               break;
        }  // switch()
      
      } // if
     
    }  // Nano_Delay()
    Last edited by jonr; 07-07-2015 at 06:02 PM.
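
To make the tick arithmetic in Setup_Nano_Delay() easier to check away from the board, here is a minimal host-side sketch of the same conversion. The helper name nanos_to_ticks is hypothetical (it is not part of the posted code), and the constants are simply copied from the listing above, so the results only hold for a 96 MHz clock and the 470 ns overhead measured by the author.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the posted sketch; CLOCK_RATE assumes a 96 MHz Teensy 3.1.
constexpr double   CLOCK_RATE    = 96.0e6;
constexpr unsigned NANO_OVERHEAD = 470;   // measured overhead in ns (author's value)
constexpr unsigned NANO_JITTER   = 18;    // size of the NOP trim window in cycles

// Hypothetical helper: same arithmetic as Setup_Nano_Delay(), minus the
// register writes, so it can run on a PC.
uint32_t nanos_to_ticks(uint32_t nanos)
{
  assert(CLOCK_RATE > 0.0);                        // sanity check only

  if (nanos < NANO_OVERHEAD)                       // cannot go below the overhead
    nanos = NANO_OVERHEAD;

  const double ns_per_cycle = 1.0e9 / CLOCK_RATE;  // about 10.42 ns at 96 MHz

  // round to the nearest whole cycle
  uint32_t ticks = static_cast<uint32_t>((nanos - NANO_OVERHEAD) / ns_per_cycle + 0.5);

  if (ticks < NANO_JITTER)                         // never less than the trim window
    ticks = NANO_JITTER;
  return ticks;
}
```

With these constants, a 700 ns request leaves 230 ns after the overhead, which rounds to 22 cycles; any request that would yield fewer than 18 cycles is raised to 18.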

  2. #2
    Junior Member
    Join Date
    Jun 2016
    Posts
    5
    Thanks a lot for this nice code.
    I use it with FastLED 3.1xx with no problems.

  3. #3
    Senior Member+ Frank B's Avatar
    Join Date
    Apr 2014
    Location
    Germany NRW
    Posts
    6,928
    You could try to use one of the timers and do it "in hardware". That would be more reliable, and you could use interrupts or DMA without affecting the timing.

  4. #4
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    654
    Since I had some questions via private message: the basic idea is to start a hardware timer, wait until it runs out, and then run NOPs for the remaining time. The timer is too coarse to use alone.
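
The two-phase idea can be illustrated with a small host-side model. This is only a sketch under simplifying assumptions: each pass through the polling loop is assumed to cost a fixed number of cycles (poll_cost), each NOP is assumed to cost exactly one cycle, and the function names are made up for the illustration. Real hardware has extra overhead that the posted code absorbs in NANO_OVERHEAD.

```cpp
#include <cassert>
#include <cstdint>

// Polling alone: the wait can only end on a multiple of poll_cost,
// so it overshoots the target by up to poll_cost - 1 cycles.
uint32_t poll_only(uint32_t target, uint32_t poll_cost)
{
  assert(poll_cost > 0);
  uint32_t elapsed = 0;
  while (elapsed < target)
    elapsed += poll_cost;
  return elapsed;
}

// Poll until fewer than `window` cycles remain, then burn the remainder
// one cycle at a time (the role of the NOP ladder in the posted code).
uint32_t poll_then_trim(uint32_t target, uint32_t poll_cost, uint32_t window)
{
  assert(poll_cost > 0 && poll_cost <= window && window <= target);
  uint32_t elapsed = 0;

  while (elapsed < target - window)     // coarse phase
    elapsed += poll_cost;

  uint32_t remaining = target - elapsed;
  if (remaining <= window) {            // fine phase, like the switch() fall-through
    for (uint32_t n = remaining; n > 0; --n)
      elapsed += 1;                     // one simulated NOP
  }
  return elapsed;
}
```

In this model, a 100-cycle target with a 7-cycle polling loop ends at 105 cycles when polling alone, and at exactly 100 cycles when the last 18 cycles are handled by the trim phase.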

  5. #5
    Junior Member
    Join Date
    Mar 2019
    Posts
    1
    Exactly what I needed! You just saved me several hours. Thanks!

  6. #6
    Thanks guys, my application was DShot for RC motor control. But in the end I had to resort to inline code with null loops and NOPs for 16 MHz devices.

  7. #7
    Senior Member
    Join Date
    Jul 2020
    Posts
    174
    When I wrote a driver that needed 400-nanosecond-wide pulses, I generated a waveform in RAM and used DMA to transfer it to the PWM generator. The 'scope demonstrated that it was quite accurate. It doesn't take much RAM to do this, and other than issuing instructions to the DMA controller, the CPU is free to do its own thing (process interrupts, and so on). The DMA controller can be programmed to transfer multiple chained blocks, and you can even do a ring buffer if you want.
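
As a rough illustration of "a waveform in RAM", the sketch below builds one period of a pulse as an array of samples and shows the wrap-around index a ring buffer would use. It is a host-side model only: the function names are hypothetical, the waveform is modelled as raw high/low samples rather than PWM register values, and the DMA setup itself is hardware specific and not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build one period of a pulse as a sample buffer: 1 = pin high, 0 = pin low.
// A DMA channel paced at one sample every sample_ns could replay it.
std::vector<uint8_t> make_pulse_buffer(uint32_t pulse_ns, uint32_t period_ns,
                                       uint32_t sample_ns)
{
  assert(sample_ns > 0 && pulse_ns <= period_ns);
  // keep the model simple: widths must be whole numbers of samples
  assert(pulse_ns % sample_ns == 0 && period_ns % sample_ns == 0);

  const uint32_t high_samples  = pulse_ns / sample_ns;
  const uint32_t total_samples = period_ns / sample_ns;

  std::vector<uint8_t> buf(total_samples, 0);   // start with the pin low
  for (uint32_t i = 0; i < high_samples; ++i)
    buf[i] = 1;                                 // high for the pulse width
  return buf;
}

// Ring-buffer replay: after the last sample, wrap to the first one.
std::size_t next_index(std::size_t i, std::size_t size)
{
  assert(size > 0);
  return (i + 1) % size;
}
```

For example, a 400 ns pulse in a 1200 ns period at 100 ns per sample needs only 12 samples: four high followed by eight low.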

  8. #8
    Senior Member
    Join Date
    May 2015
    Location
    USA
    Posts
    654
    What is the fastest rate (words/sec) that DMA can output?

  9. #9
    Senior Member
    Join Date
    Jul 2020
    Posts
    174
    That depends greatly on the microcontroller, bus, and signaling protocol. If there are data + clock lines, then you can have as many bits per second as states (low or high) per second. On the other hand, if there is no dedicated clock line, then the data line has to be self-clocking, and that usually means one bit takes more than one high/low state to transfer.
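
To make the self-clocking point concrete, here is a minimal sketch using Manchester coding, which is one well-known self-clocking scheme (it is my choice of example, not something named in this thread). Each bit is sent as two states with a transition in the middle, so a byte costs 16 states instead of 8. The polarity used here (1 = low then high, the IEEE 802.3 convention) and the function names are assumptions for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode one byte, MSB first, as 16 line states (0 = low, 1 = high).
std::vector<uint8_t> manchester_encode(uint8_t byte)
{
  std::vector<uint8_t> states;
  states.reserve(16);
  for (int bit = 7; bit >= 0; --bit) {
    const bool one = ((byte >> bit) & 1u) != 0;
    states.push_back(one ? 0 : 1);   // first half of the bit cell
    states.push_back(one ? 1 : 0);   // second half: always the opposite level
  }
  return states;
}

// Decode 16 states back into a byte; the receiver can recover the clock
// because every bit cell is guaranteed to contain a transition.
uint8_t manchester_decode(const std::vector<uint8_t>& states)
{
  assert(states.size() == 16);
  uint8_t byte = 0;
  for (std::size_t i = 0; i < 16; i += 2) {
    assert(states[i] != states[i + 1]);   // no transition means a framing error
    byte = static_cast<uint8_t>((byte << 1) | states[i + 1]);
  }
  return byte;
}
```

With a scheme like this, a line that can change state 20 million times per second carries 10 million payload bits per second.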

    The DMA on any modern Teensy can support 400-nanosecond-wide pulses (as evidenced by OctoWS2811) and I would guess it can get all the way down to double-digit nanoseconds. For precise numbers, you would have to look at the particular model's CPU manual or datasheet, which can be found here: https://www.pjrc.com/teensy/datasheets.html

    One good example of this is the OctoWS2811 library, which can drive anywhere from 1 to 8 NeoPixel arrays. Up to 8 PWM pins are driven by separate DMA channels. The total throughput is 20,000,000 states/second. Because the WS281x protocol requires three states to transmit one bit, that gives you 6.66 million bits/second, minus latch time. (At the end of every frame there's a continuous low signal, 50 microseconds I think, which means "latch." That causes the shift registers inside the LEDs to dump their contents to their PWM generators, resulting in the display of whatever color was sent.)
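
The throughput arithmetic above can be sketched as follows. The encoding shown (every bit sent as high, then the data level, then low, MSB first) is my understanding of the usual WS281x three-state scheme and should be read as an assumption; the function names are hypothetical and nothing here is taken from the OctoWS2811 source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Expand one byte into WS281x-style line states, three per bit:
// always high, then the data bit, then always low.
std::vector<uint8_t> ws281x_states(uint8_t byte)
{
  std::vector<uint8_t> states;
  states.reserve(24);
  for (int bit = 7; bit >= 0; --bit) {
    states.push_back(1);                                     // start of bit: high
    states.push_back(static_cast<uint8_t>((byte >> bit) & 1u)); // long or short pulse
    states.push_back(0);                                     // end of bit: low
  }
  return states;
}

// Payload rate for a given state rate, ignoring latch time.
uint32_t bits_per_second(uint32_t states_per_second, uint32_t states_per_bit)
{
  assert(states_per_bit > 0);
  return states_per_second / states_per_bit;   // integer division, rounds down
}
```

Using the figures quoted above, 20,000,000 states per second at three states per bit comes to 6,666,666 bits per second before latch time is subtracted.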

    On the more extreme end, if the DMA and PWM controllers can support 50-nanosecond pulses, you would be getting 20 megabits/sec on a single pin at one state per bit, which is more than the 12 megabits/sec of full-speed USB (USB 3 itself runs at gigabits per second, far beyond this). (This doesn't take into account framing and error correction, so you would be going a little slower than that. NeoPixels have no error correction, so their protocol is more straightforward.)
