Generating 10 independent square waves

fitek · Aug 6, 2015

I'm guessing it's not possible to generate 10 square waves of different frequencies with one Teensy, but wanted to see if I was wrong about that - maybe some software way to get around it.

I am currently using a Mega to generate 4 such square waves in the range 15 Hz to 2 kHz. I actually had to use a different oscillator to get down to 15 Hz as the Mega ran too "fast"!

I see the Teensy LC has 7 timers, so another possibility is to have multiple Teensys - one that generates square waves 1-5 and another for 6-10, a third possibly as the brain to coordinate it all.

Thoughts?

Thanks

PaulStoffregen · Aug 6, 2015

Teensy-LC does have 7 timers, and Teensy 3.1 has 12 of them. But they are several different types of timers with very different capabilities. The Teensy-LC page has a tech specs table with the details: (scroll down to the table)

http://www.pjrc.com/teensy/teensyLC.html

The FTM timers are the ones which generate PWM. Each board has 3 of those, so you can get 3 different frequencies easily using the FTM timers.

I believe the CMT timer can generate a square wave, if you dig into the details of how to program it. Admittedly, the CMT timer is complex and Freescale's documentation on the CMT leaves a lot to be desired. I did use the CMT when I ported the FrequencyTimer2 library, so perhaps look at that code for an example. Or maybe that library will meet your needs?

If you're willing to accept a little jitter, you can generate a waveform with code using IntervalTimer. The tone() function does this, so maybe you can just use tone()? Or you can write your own. Since it's generated by interrupt code, varying interrupt latency will create some jitter. There are some techniques for jitter reduction, but of course it'll never be as perfect as a hardware generated waveform from the FTM or CMT timers.

If you use up all the PIT timers, you can also use the PDB, LPTMR and Systick to run an interrupt at a regular interval, which toggles a pin like tone(). Each of these timers has its own register set, similar to how AVR's 8 and 16 bit timers have very different registers. If you commandeer Systick, millis() and other Arduino timing functions won't work.

If you go to the trouble to write code for all these different timers, I believe you should be able to get 3 hardware waveforms and 4 interrupt waveforms on Teensy-LC, or 4 hardware and 7 interrupt ones on Teensy 3.1.

doughboy · Aug 6, 2015

Are you bit banging an output pin based on timer interrupt, or using waveform generation feature of the pin?

From this page:
https://www.pjrc.com/teensy/td_pulse.html
if you scroll down to the PWM frequency section, there are only 3 possible frequencies (since the frequency you set applies to all the pins in the group).
I ran into this issue and realized I have to run one group at 25 kHz and another at 1 kHz, and cannot just pick any PWM pin to run at a different frequency.
I take it that since you say square wave, you want a 50% duty cycle.

Nominal Animal · Aug 9, 2015

I wonder what kind of performance you would get from a pure software loop generating all waveforms in parallel. Obviously, any kind of work other than generating the waveforms -- changing any frequency, responding to a USB packet, anything -- would cause jitter on all waveforms generated. I'm too lazy to test for myself -- also, I don't have an oscilloscope (good enough for measuring something like this).

If a "pause" or a "glitch" whenever changes are applied or USB communications occur is acceptable, this might be an acceptable solution.

For 50% duty cycle, for each waveform:

Code:

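    uint32_t accumulator, rate;   /* per-waveform phase accumulator and step */

    {
        const uint32_t old = accumulator;
        accumulator += rate;
        if (accumulator < old)    /* unsigned overflow: accumulator wrapped */
            PORT ^= MASK;         /* flip this waveform's output pin */
    }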

The idea is that whenever the accumulator wraps around, the output pin is flipped. This can be written in very tight assembly code, too, especially if the pins are consecutive in the same port.

Initially, accumulator=0, and rate=8589934592.0*wave_freq/loop_freq, where wave_freq is the desired frequency for the waveform, and loop_freq is the frequency at which the above code is run. The constant is 2^33 because the pin flips once per 32-bit wrap-around and a full period needs two flips. Frequencies are quantized to the nearest integer rate, the theoretical frequency being wave_freq=rate*loop_freq/8589934592.0 . The inherent jitter is 1/loop_freq , as each half-period lasts either n or n+1 loop iterations; lower frequencies are therefore relatively cleaner.

With slightly more complex code, and 2N+1 additional 32-bit unsigned integers per waveform, you could generate N-pulse trains, too.

I don't have an oscilloscope, and I'm anyway just waving my hands here (I'd put this rambling at 85% reliability), but I think you could get loop_freq to about 1 MHz for up to sixteen waveforms (two ports). Assuming that was the exact rate (I'd measure it in practice!), rate=128849 would correspond to wave_freq=15.0000 Hz, and rate=171798692 would correspond to wave_freq=20000.0000 Hz.
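The accumulator idea can be checked on a desktop machine before touching hardware. The following is a minimal host-side sketch, not Teensy code: the helper names `nco_rate` and `nco_count_toggles` are made up for illustration, and the pin flip is replaced by a counter. It assumes the pin flips once per 32-bit wrap-around, so one full period takes two wraps and the scale factor is 2^33.

```c
#include <stdint.h>

/* Hypothetical helper: step size for a 32-bit accumulator that flips the
   pin on every wrap-around. Two wraps make one full period, hence 2^33. */
static uint32_t nco_rate(double wave_freq, double loop_freq)
{
    return (uint32_t)(8589934592.0 * wave_freq / loop_freq + 0.5);
}

/* Hypothetical helper: run the loop body `iterations` times and count how
   often the pin would have been flipped. */
static uint32_t nco_count_toggles(uint32_t rate, uint32_t iterations)
{
    uint32_t accumulator = 0, toggles = 0;
    for (uint32_t i = 0; i < iterations; i++) {
        const uint32_t old = accumulator;
        accumulator += rate;
        if (accumulator < old)      /* unsigned wrap-around detected */
            toggles++;
    }
    return toggles;
}
```

Simulating one second at an assumed loop_freq of 1 MHz (1000000 iterations) with the rate for 20 kHz should give 40000 flips, i.e. 20000 full periods.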

Another approach is to use a single timer interrupt, reprogramming it to the next pulse edge, and do all work in the timer interrupt. This would let the Teensy do other work at the same time. There would be slightly more work in the timer interrupt, so it might cause jitter to everything else; I'm not sure if it would negatively affect other stuff (especially USB). Would need testing to verify.
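A rough sketch of the bookkeeping such a single-timer interrupt would need, written as plain C that runs on a host. Everything here is an assumption for illustration: the `edge_sched` struct, the function name, the three-wave size, and the idea that time is counted in abstract timer ticks. Actually re-arming a hardware timer is left out, and the sketch assumes interrupt latency stays below the shortest half-period.

```c
#include <stdint.h>

#define NUM_WAVES 3   /* illustrative size only */

typedef struct {
    uint32_t next_edge[NUM_WAVES];    /* absolute tick of each wave's next edge */
    uint32_t half_period[NUM_WAVES];  /* ticks between edges, must be > 0 */
    uint32_t pins;                    /* bit i = current level of wave i */
} edge_sched;

/* Body of the hypothetical timer interrupt, called at tick `now`.
   Flips every wave whose edge is due and returns the tick at which the
   timer should fire next. */
static uint32_t edge_sched_service(edge_sched *s, uint32_t now)
{
    uint32_t soonest = UINT32_MAX;    /* shortest wait found so far */
    for (int i = 0; i < NUM_WAVES; i++) {
        if ((int32_t)(now - s->next_edge[i]) >= 0) {   /* edge is due */
            s->pins ^= 1u << i;
            s->next_edge[i] += s->half_period[i];
        }
        uint32_t wait = s->next_edge[i] - now;         /* wrap-safe difference */
        if (wait < soonest)
            soonest = wait;
    }
    return now + soonest;
}
```

Scheduling each edge from the previous scheduled time rather than from `now` keeps interrupt latency from accumulating into drift.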

PaulStoffregen · Aug 9, 2015

Yet another crazy idea, which might give pretty good performance and low CPU usage, might use the FTM compare match to trigger DMA channels.

Each "minor loop" would need to do 2 writes: toggle a GPIO pin, and also write a new COV value to the FTM channel. A big list of the GPIO bitmap interleaved with FTM compare points would be needed in RAM. The DMA half-used interrupt could be used to run code to generate more data in one half of the buffer while the DMA channel uses up the other half. If the buffer is large, like 400 compare points (4 bytes per point), DMA interrupts should occur at 1/100th of the desired frequency. Even for 20 kHz, that's only 200 interrupts per second. Each has to recompute 100 more, but for a fixed frequency the algorithm is simply adding the half-period time to the anticipated timer value of the previous compare point. Eight of those interrupts running at 200 Hz should use very little CPU time, and each can tolerate quite a lot of interrupt latency... you get "perfect" output as long as it's able to run and refill the other half of the buffer before the current half gets used up.
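The refill step for a fixed frequency is just repeated addition. Here is a minimal host-side sketch of that arithmetic, assuming a free-running 16-bit timer counter so that compare values wrap modulo 65536. The function names are hypothetical, and the GPIO words that would be interleaved with the compare points, as well as the DMA setup itself, are omitted.

```c
#include <stdint.h>
#include <stddef.h>

/* Fill `count` slots with successive compare points for one fixed-frequency
   wave. `last` is the compare value of the previous edge; uint16_t
   arithmetic wraps the same way a 16-bit timer counter does. */
static uint16_t refill_compare_points(uint16_t *buf, size_t count,
                                      uint16_t last, uint16_t half_period)
{
    for (size_t i = 0; i < count; i++) {
        last = (uint16_t)(last + half_period);
        buf[i] = last;
    }
    return last;    /* pass this back in on the next refill */
}

/* Refill interrupts per second for one wave: edges per second divided by
   the number of compare points in half of the buffer. */
static uint32_t refills_per_second(uint32_t wave_freq_hz,
                                   uint32_t points_per_half)
{
    return (2u * wave_freq_hz) / points_per_half;
}
```

With a 400-point buffer (200 points per half) and a 20 kHz wave, `refills_per_second(20000, 200)` gives 200, matching the estimate above.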

Of course, the DMA engine has slight latency to gain access to the bus matrix, but it's usually 10X less than best-case interrupt latency, and *far* better than having to wait for same or higher priority interrupts. It also doesn't get impacted by code that needs to temporarily disable interrupts.

This DMA-based idea would only work on Teensy 3.1. Teensy-LC has only 4 DMA channels, and they're simpler, lacking the half-full interrupt or flexible minor loop feature. Only the more sophisticated DMA in 3.1 could pull this off.

Nominal Animal · Aug 9, 2015

You mentioned here that you can get IntervalTimer jitter down to 40ns or so on Teensy 3.x; what about Teensy LC? What features would be lost, if one were to reappropriate the SysTick for this?

Consider the following state:

Code:

    static          uint8_t systick_state[256];       /* For 8 outputs */
    static          uint8_t systick_interval[256][3]; /* 24-bit values */
    static volatile uint8_t systick_head;
    static          uint8_t systick_tail;

Now, systick would simply reprogram the next interval to systick_interval[systick_head], set the output pins to systick_state[systick_head], and increment systick_head.

Another interrupt, or perhaps the main code, would generate more transitions systick_state[systick_tail], systick_interval[systick_tail], whenever systick_tail != systick_head.

The number of new states to compute each second is twice the sum of the frequencies in Hertz, but it is very lightweight (just additions, subtractions, and comparisons), so that should not pose any problems. All eight transitions may occur just after one another, so any set of N+8 transitions must take at least N half-cycles of the highest-frequency square wave. This means that 256 transitions is quite sufficient for buffering.
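One way the queue handling could look, as a host-testable sketch. The function names `transition_push` and `transition_pop` are invented here, and two conventions are assumptions rather than anything stated in the post: the 24-bit interval is stored low byte first, and one slot is always left empty so that head == tail unambiguously means "empty". On real hardware the pop would live in the SysTick handler and write the reload register and the port instead of returning values.

```c
#include <stdint.h>

static          uint8_t systick_state[256];       /* one bit per output */
static          uint8_t systick_interval[256][3]; /* 24-bit values, low byte first */
static volatile uint8_t systick_head;             /* consumer (interrupt) index */
static          uint8_t systick_tail;             /* producer index */

/* Producer side: queue one transition. Returns 0 if the queue is full. */
static int transition_push(uint8_t state, uint32_t interval)
{
    if ((uint8_t)(systick_tail + 1) == systick_head)
        return 0;                                  /* full: one slot kept free */
    systick_state[systick_tail] = state;
    systick_interval[systick_tail][0] = (uint8_t)(interval);
    systick_interval[systick_tail][1] = (uint8_t)(interval >> 8);
    systick_interval[systick_tail][2] = (uint8_t)(interval >> 16);
    systick_tail++;                                /* uint8_t wraps at 256 */
    return 1;
}

/* Consumer side: fetch the next transition. Returns 0 if the queue is empty. */
static int transition_pop(uint8_t *state, uint32_t *interval)
{
    const uint8_t h = systick_head;
    if (h == systick_tail)
        return 0;                                  /* empty */
    *state = systick_state[h];
    *interval = (uint32_t)systick_interval[h][0]
              | (uint32_t)systick_interval[h][1] << 8
              | (uint32_t)systick_interval[h][2] << 16;
    systick_head = (uint8_t)(h + 1);
    return 1;
}
```

Because the indices are 8-bit and the arrays have 256 entries, no explicit modulo is needed; the usable capacity under this convention is 255 transitions.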

A bigger problem is that the initial jitter would cause a cumulative drift. If the average delay is constant or easily estimated, then it should be possible to compensate for it.

Probably not. So, consider that just an idea that is unlikely to work in practice.

Edited to add: Using two DMA channels, linked, on the Teensy LC, should work. If the next state change is far in the future, the DMA pushes out a static byte for the duration. Otherwise, a buffer is used to contain the near-future state changes (say, using three 1024-byte buffers for triple buffering), so that each DMA'ed block is at least a minimum length. The completion interrupt is used to set up the next DMA, so it only needs to occur before the linked (just started) DMA completes.

Raises some very interesting possibilities for a low cost pulse wave generator, I think.

fitek · Aug 21, 2015

Well, I managed to create 10 pretty good square waves using just the interval timer set to 5 ms. I tried 1 ms but the wave was not consistent. I had to really optimize my timer function so that it is just one loop and one simple if. Next up, I'll throw serial comms into the mix and see how poorly that goes, though if the disturbance is minimal, I should be OK.
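The actual timer function was not posted. One plausible shape for "one loop and one simple if" is a per-channel countdown, sketched below as host-testable C. All names are hypothetical, the outputs are collected in a bitmask instead of being written to pins, and `reload` is assumed to hold each channel's half-period in timer ticks.

```c
#include <stdint.h>

#define NUM_CHANNELS 10

static uint16_t reload[NUM_CHANNELS];     /* half-period in ticks, >= 1 */
static uint16_t countdown[NUM_CHANNELS];  /* ticks left until the next flip */
static uint16_t outputs;                  /* bit i = level of channel i */

/* Called once per timer tick: one loop, one if. */
static void tick(void)
{
    for (int i = 0; i < NUM_CHANNELS; i++) {
        if (--countdown[i] == 0) {
            countdown[i] = reload[i];
            outputs ^= (uint16_t)(1u << i);
        }
    }
}
```

With this scheme every half-period is a whole number of ticks, so the tick interval sets both the highest frequency and the frequency resolution.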
