Looking for an example of how to use FlexPWM and DMA together (Teensy 4.1)

Hello everyone, :D

I'm trying to create a simple DShot implementation with the T4.1 so I can control ESCs of a quadcopter.

I have a general understanding of why DMA is a good choice for DShot and why I need some kind of timer in order to accurately transfer the data to the ESC. (That said, I'm certain I don't know everything, so any info is always welcome!)

I've gathered some information about the different elements that I think are required to solve this problem:


1. For DMA, based on various forum posts, I think it's best to use this library: https://github.com/PaulStoffregen/cores/blob/master/teensy4/DMAChannel.h
Making my own would be quite a stretch, considering I haven't worked with DMA up to now.

2. For the pulse length & timing that DShot requires, I think it's best to use the FlexPWM timer (this is based on what I've read in the reference manual).
Info about the DShot protocol >> https://blck.mn/2016/11/dshot-the-new-kid-on-the-block/ (a short frame-format sketch follows below)


So these are the two things I think I need in order to implement DShot on the teensy.
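
For reference, from the DShot write-up linked above: a frame is 16 bits - an 11-bit throttle value, a telemetry request bit, and a 4-bit checksum. Here's a rough sketch of how I'd pack a frame (my own untested helper, not from any library):

Code:
#include <stdint.h>

// Pack an 11-bit throttle value and a telemetry flag into a 16-bit DShot frame.
// The 4-bit checksum is the XOR of the three nibbles of the 12-bit payload.
uint16_t dshotFrame(uint16_t throttle, bool telemetry) {
	uint16_t payload = (uint16_t)((throttle & 0x07FF) << 1) | (telemetry ? 1 : 0);
	uint16_t crc = (payload ^ (payload >> 4) ^ (payload >> 8)) & 0x0F;
	return (uint16_t)(payload << 4) | crc; // MSB goes out on the wire first
}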


The question that remains is how to make them work together:

1. How do I set up the destination of the DMA to be the output of FlexPWM?
2. Once I write something to the DMA source, how do I make the DMA trigger based on the value of the FlexPWM timer?

I hope I've explained my question well enough, if there is any confusion please let me know. :)
 
I'm not sure whether FlexPWM is useful for transmitting a finite stream of pulses - it is possible to use DMA to vary the pulse width, but making the stream start and stop precisely could be difficult...

It looks like DShot is pretty similar to the WS2811 protocol. It may be possible to use a DMA to GPIO technique like Paul did for OctoWS2811.

The easy way would be to simply toggle a pin in a timed loop, if you don't need to multitask.
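
Roughly something like this, if you go the timed-loop route (an untested sketch; it assumes Teensyduino's digitalWriteFast() and delayNanoseconds() on the Teensy 4, and DShot600 timings of roughly 1670 ns per bit, 1250 ns high for a 1 and 625 ns high for a 0):

Code:
#include <Arduino.h>

// Untested sketch: bit-bang one 16-bit DShot600 frame on a pin, MSB first.
// Call pinMode(pin, OUTPUT) beforehand; loop overhead will stretch the
// timings slightly, so check the result on a scope.
void dshotBitBang(uint8_t pin, uint16_t frame) {
	noInterrupts(); // keep the timing tight
	for (int i = 15; i >= 0; i--) {
		bool one = (frame >> i) & 1;
		digitalWriteFast(pin, HIGH);
		delayNanoseconds(one ? 1250 : 625);
		digitalWriteFast(pin, LOW);
		delayNanoseconds(one ? 420 : 1045); // rest of the ~1670 ns bit period
	}
	interrupts();
}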
 
Thank you! :D

I'll take a deeper look at it, I did come across it but I didn't think it would work.

Also, I don't think I need to multitask; the ESCs get their instructions sequentially. (I'm also not 100% sure about this, but I'll have to try it out to see if there are any stability issues.)
 
You should try to modify the WS2812Serial library, rather than go down this deep rabbit hole of using the PWM hardware with DMA!

https://github.com/PaulStoffregen/WS2812Serial

Here is a blog article I wrote about how WS2812Serial works. Hopefully it will give you the background info you need to start editing the code.

https://www.pjrc.com/non-blocking-ws2812-led-library/

The library code is about 360 lines, but the majority of them are just copies of hardware init for the various Teensy boards. Probably your very first step should be to trim away all the cases for Teensy 3.x & LC. Less code is easier code!

Another big block of code is the show() function, which translates the raw pixel data into the bit patterns to actually transmit. There too, pick one of the many cases and delete the rest. Ultimately each 2 bits of real data becomes 1 byte to transmit (which is still much less than 2 or 4 PWM reload numbers at 16 bits each). You won't need all that color order translation stuff, so just hard-code the bits to bytes order.

If you get stuck, I recommend just ignoring the first buffer, copying some constant bytes into the transmit buffer, and watching the result on an oscilloscope or logic analyzer. Once you understand what writing various bytes actually does, then work on how to translate your data bit pairs into the bytes to actually transmit on the wire.

To make it run at 600 or 300 speed rather than 800, you'll need to edit this line.

Code:
	uart->BAUD = LPUART_BAUD_OSR(5) | LPUART_BAUD_SBR(1) | LPUART_BAUD_TDMAE;  // set baud configure for transfer DMA

You'll need to look up the OSR and SBR fields of the BAUD register in the IMXRT1060 reference manual, LPUART chapter.

https://www.pjrc.com/teensy/datasheets.html

The LPUARTs run at 24 MHz. Those existing settings divide it by 6, and then every 5 bit times becomes 1 bit of the actual protocol. So for 600 speed, you'll need to divide 24 MHz by 8. Hopefully figuring out the right OSR & SBR numbers should be relatively easy.
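
For example (my arithmetic, untested: the LPUART baud rate is 24 MHz / (SBR x (OSR+1)), and each protocol bit stays at 5 serial bit times), the 600 and 300 settings would be something like:

Code:
// DShot600: 600 kbit/s x 5 serial bits = 3 Mbit/s, so divide 24 MHz by 8
uart->BAUD = LPUART_BAUD_OSR(7) | LPUART_BAUD_SBR(1) | LPUART_BAUD_TDMAE;

// DShot300: 300 kbit/s x 5 serial bits = 1.5 Mbit/s, so divide 24 MHz by 16
uart->BAUD = LPUART_BAUD_OSR(7) | LPUART_BAUD_SBR(2) | LPUART_BAUD_TDMAE;

Worth double-checking those against the OSR and SBR field descriptions in the LPUART chapter before trusting them.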

When you do get it running, I hope you'll be willing to share the known-good code and anything you learn along the way. Maybe it'll help others who want to use this special protocol?
 
Thank you very much for the guidance Paul! As a newcomer to Teensy (and the MCU world as a whole), it makes all the difference to have someone with experience push you in the right direction, especially considering I was headed headfirst toward that rabbit hole!

I hope to have this running soon-ish; if I have any other questions I'll write them here. (I'm not 100% sure how notifications work on this forum, so I hope any of the following replies send notifications.)

I'll be sure to post an update on this thread, as well as a post about the whole project on the General channel once I'm done! Sharing knowledge is the heartbeat of the maker community! :D
 
So I've read through the library code and the description, and I looked at the manual and I have a couple of questions.

1. When you say "Ultimately each 2 bits of real data becomes 1 byte to transmit (which is still much less than 2 or 4 PWM reload numbers at 16 bits each)", did you mean 2 bits become 1 bit, and could you give a simple example?

2. Is the speed of the transmitter faster than the required protocol speed so that data is transferred faster, or is there something else? (For example, instead of getting 1 bit at a time, the receiver will get 5 bits - that's how I understand it.)

3. From what I've read about the protocols, the difference between a 0 and a 1 is that a 0 is a shorter pulse and a 1 is a longer pulse. Is the length of the pulse made by transmitting, for example, 110 to indicate a 1, or does it work another way?

There is a part of the code:

Code:
const uint8_t *stop = fb + 16;
do {
	// each pass consumes the top 2 bits of n and emits one serial byte
	uint8_t x = 0x08;                  // base pattern: both pulses long (two '1' bits)
	if (!(n & 0x80000000)) x |= 0x07;  // first source bit is 0 -> shorten first pulse
	if (!(n & 0x40000000)) x |= 0xE0;  // second source bit is 0 -> shorten second pulse
	n <<= 2;                           // advance to the next 2 source bits
	*fb++ = x;
} while (fb < stop);

that I assume does this "translation" from 1's and 0's to the actual 1's and 0's that need to be transmitted down the wire, but I just can't get why it is the way it is.
 
Please try to read this page again. It has the answers to all those questions.

https://www.pjrc.com/non-blocking-ws2812-led-library/

Especially this:

The serial port is configured to run at 4 Mbit/sec, which is exactly 5 times the 800 kbit/sec speed WS2812 LEDs expect. Every 5 data bits becomes one cycle of the WS2812 signal.

[Attached image: ws2812waveform.jpg - WS2812 waveform diagram from the blog post]
 
Ok, I think I've got it, but I'll write the explanation as simply as I can.

So basically, the "why" behind the faster rate is that every 5 serial bits together make up one bit of the actual protocol, instead of the receiver seeing the serial 1's and 0's one at a time. This is how the length of the pulse is generated.

As for 2 bits becoming 1 byte: to send the information of 2 data bits you need to use 8 serial bits, because of the way a 1 and a 0 are represented in the protocol.

(Sorry for asking questions that already have the answers, I'm just really new to this and it takes me a while to get my head around the details and I need to break everything down into absolute basics in order to understand it :D)
 
For every byte you send, hardware serial transmits a start and a stop bit. Every 8-bit byte you transmit becomes 10 actual bit times transmitted, where 2 of the bits are fixed. That's simply how serial transmission works. It's a very old & simple protocol.

Fortunately, those 2 fixed bits you can't control are exactly the sort of thing you want. That's the point of the hand-drawn diagram, where both of them are labeled "S". You really should use an oscilloscope or logic analyzer to view the waveforms you are creating. Again, as I suggested earlier, if you need to see how things really work, the best way is to just add some code which writes a few fixed numbers into the transmit buffer, and then do a short transmission. It will all make much more sense when you can see the actual waveforms change in response to the numbers you put into the buffer. Get that working first, and then build upon that success to achieve your end goal.
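
As a worked example of what to expect on the scope (my own sketch of the numbers, assuming the inverted TX output the library configures, 250 ns serial bit times at 4 Mbit/sec, and bytes going out LSB first):

Code:
// byte 0x08 on the wire (start bit S, data bits b0..b7 LSB first, stop bit St):
//   S  b0 b1 b2 b3 b4 b5 b6 b7 St
//   HI HI HI HI LO HI HI HI HI LO   -> two 1000 ns high pulses = two protocol '1' bits
//
// byte 0xEF (= 0x08 | 0x07 | 0xE0) on the wire:
//   S  b0 b1 b2 b3 b4 b5 b6 b7 St
//   HI LO LO LO LO HI LO LO LO LO   -> two 250 ns high pulses = two protocol '0' bits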

The choice to send 2 bits in each byte isn't written in stone. In an early version of this library, I tried to send 3, configuring the hardware for 7 data bits which gives 9 when start and stop are added by the hardware, and of course a baud rate of 2.4 Mbit/sec. But having only 3 bit times (417 ns each) didn't get enough timing resolution. Some LEDs would recognize a 417 ns wide pulse as a zero, but others needed the pulse to be shorter to reliably see it as zero. So instead I went with 4 Mbit/sec, which gives 250 ns bit times. All the WS2812 LEDs very reliably recognize a 250 ns pulse as zero and 1000 ns as one.

While less memory efficient, you could opt for just 1 protocol bit per byte (10 serial bit times of resolution). To get 600 kHz, you'd configure the baud rate for 6 Mbit/sec. That would give you 167 ns serial bit times. You can't achieve exactly the 625 ns pulse this protocol wants; you have to round off to some number of serial transmission bit times. 4 of those bits gives 667 ns. 3 of them gives 500 ns, which isn't as close to 625 ns, but maybe erring on the shorter side is better?

Whatever way you go, it's best to hook up a scope or logic analyzer so you can see the actual waveform. Then start experimenting with bytes in the buffer and see the actual resulting signal. Ultimately your code will just fill the transmit buffer with bytes and then use the existing code to make them transmit on the wire at whatever baud rate you configure. You could spend a lot of time thinking and analyzing, but ultimately experimenting and seeing the actual waveforms is the way to really understand this, as well as confirm you've really made it work properly. So please, do yourself a favor: get an oscilloscope or logic analyzer hooked up to the transmit pin and start by just running the WS2812Serial examples, for the sake of knowing your gear is capturing the waveform.
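
For instance, sticking with the 2-bits-per-byte scheme the library already uses, filling the buffer for one 16-bit frame could look roughly like this (my untested sketch; encodeFrame is just a hypothetical helper name, fb stands in for the library's DMA transmit buffer, and the 0x08 / 0x07 / 0xE0 patterns assume the same inverted-TX encoding as the WS2812 case):

Code:
#include <stdint.h>

// Untested sketch: encode a 16-bit frame into 8 serial bytes,
// 2 protocol bits per byte, MSB of the frame transmitted first.
// fb points at the 8-byte DMA transmit buffer.
void encodeFrame(uint16_t frame, uint8_t *fb) {
	uint16_t n = frame;
	for (int i = 0; i < 8; i++) {
		uint8_t x = 0x08;              // both bits long ('1','1') by default
		if (!(n & 0x8000)) x |= 0x07;  // first bit is 0 -> short pulse
		if (!(n & 0x4000)) x |= 0xE0;  // second bit is 0 -> short pulse
		n <<= 2;
		*fb++ = x;
	}
}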



FWIW, the history of start and stop bits goes back a LONG way. The old serial protocol started with ancient teletype machines, which used 5 data bits and more stop bit time. The start bit caused a solenoid to release a mechanical clutch. Then a wheel would rotate at approximately the speed of the incoming bits, where each bit would or would not trigger a pin to be pressed by a solenoid as it rotated by. At the end of this cycle, the pins would cause one of the typewriter's keys to be selected to mechanically strike the paper, of course through an inked ribbon. During the stop bits, the wheel would continue rotating back to its original position, where the clutch would grab and hold it until the next start bit arrived to again release it and start the whole cycle all over again. I believe those old teletype machines used 50 bits/sec speed. Today we still call those bits start and stop, referring to the physical wheel in those old teletype machines starting to rotate when the clutch would release, and then stopping again after the key struck the paper.
 
One more quick tip. If you do use an oscilloscope, the trick to making it work for this use is the "hold off time", and make sure triggering (on rising edge) is "normal" rather than "auto".

Set the hold off time to longer than all the data you will transmit. In your code, put a lengthy delay, significantly longer than the hold off time, between each transmission. This way, your oscilloscope will start capturing at the beginning of the data and "hold off" from retriggering throughout the entire transmission. Oscilloscopes are very powerful, but setting them up correctly requires a lot of experience. Retriggering within the data just ends up showing a confusing jumble on the screen, so set the hold off time correctly to avoid that problem.

A cheap logic analyzer which captures a very long time to software on your PC might be easier if you're not experienced with an oscilloscope.
 
Thank you for the clarification! The history of the serial protocol is also interesting.

I've been working on the code all day today; I hooked up my oscilloscope and managed to see the waveform that comes from the library directly. After that initial test, I started tinkering and trying to understand what does what.

At the moment I've managed to sort of output my desired waveform. Here is what I did:

Since a DShot frame is 16 bits (2 bytes), I changed the "drawingMemory" buffer to have a length of 2 and "displayMemory" to 8 (following the logic for the LEDs - for each data byte I need 4 bytes to compose the serial output). Here is the code I'm using:

This is all I changed in the header file; it's how I set the drawBuffer to the value I want. It's supposed to be 0x800F (I think I had to switch the byte order because the Teensy is little-endian).
Code:
void setPixel(uint32_t num, uint32_t color) {
	if (num >= numled) return;

	drawBuffer[0] = 0x0F;
	drawBuffer[1] = 0x80;
}

This is the show function.
Code:
void WS2812Serial::show()
{
	uint32_t microseconds_per_led, bytes_per_led;

	// wait if prior DMA still in progress
	while ((DMA_ERQ & (1 << dma->channel)))
	{
		yield();
	}

	// copy drawing buffer to frame buffer
	const uint8_t *p = drawBuffer;
	uint8_t *fb = frameBuffer;

	uint8_t firstHalf = *p++;  
	uint8_t secondHalf = *p;
	uint16_t n = 0;

	n = (firstHalf <<8) | (secondHalf);

	const uint8_t *stop = fb + 8; // the logic behind this being 8: composing the serial data for a 2-byte frame only needs 8 loop iterations
	do
	{
		uint8_t x = 0x08;
		if (!(n & 0x00800000)) // here I tested removing two 0s from the end of the original mask, the logic being that I'm working with a 16-bit frame instead of a 24-bit color
			x |= 0x07;
		if (!(n & 0x00400000)) 
			x |= 0xE0;
		n <<= 2;
		*fb++ = x;
	} while (fb < stop);

	microseconds_per_led = 30;
	bytes_per_led = 8; //Again changing this because of the 2byte frame.

	// wait 300us WS2812 reset time
	uint32_t min_elapsed = (numled * microseconds_per_led) + 300;
	if (min_elapsed < 2500)
		min_elapsed = 2500;
	uint32_t m;
	while (1)
	{
		m = micros();
		if ((m - prior_micros) > min_elapsed)
			break;
		yield();
	}
	prior_micros = m;

	// start DMA transfer to update LEDs  :-)
	// See if we need to muck with DMA cache...
	if ((uint32_t)frameBuffer >= 0x20200000u)
	{
		arm_dcache_flush(frameBuffer, bytes_per_led);
	}

	dma->sourceBuffer(frameBuffer, bytes_per_led);
	//dma->transferSize(1);
	dma->transferCount(bytes_per_led);
	dma->disableOnCompletion();

	uart->STAT = 0; // try clearing out the status
	dma->enable();
}


This is the example program I'm using, it's based on the basic example from the library
Code:
#include <LedLib.h>
#include <Arduino.h>

const int numled = 1;
const int pin = 1;

byte drawingMemory[2];        //  2 bytes per DSHOT Frame
DMAMEM byte displayMemory[8]; // This means 8 total bytes to compose serial data.

WS2812Serial leds(numled, displayMemory, drawingMemory, pin, WS2812_RGB);

void colorWipe(int color, int wait)
{
  for (int i = 0; i < leds.numPixels(); i++)
  {
    leds.setPixel(i, color);
    leds.show();
    delayMicroseconds(wait);
  }
}

void setup()
{
  leds.begin();
}

void loop()
{
  // change all the LEDs in 0.3 seconds
  int microsec = 300000 / leds.numPixels();
  colorWipe(0xF001, microsec);
}

After all of these changes I assumed I'd send only 16 bits of information, but I'm sending 24 bits - the same amount the untouched code sent. The only difference is that the last 16 bits are the data I want to send; there are 8 bits at the front and I don't know where they come from, considering the DMAMEM buffer is only 8 bytes.

This is the waveform I got before
[Attached image: Original.png - waveform before the code modifications]

This is the waveform I got after the code modifications
[Attached image: After Code Mods.png - waveform after the code modifications]

It's those 8 bits at the front that confuse me the most. I've written out the logic behind the changes on paper a couple of times and it seems correct; I also wrote out how the code that composes the serial data works and I got the hang of it (it's pretty clever!).
 
An update from the tests I did this morning - I'm very confused as to how this code works.

I've deleted literally everything from the show function, commented out all the DMA initialization code, set the buffer sizes to 0, and removed the serial configuration lines, yet I still get a waveform consisting of 24 zero bits.

I'm using Visual Studio Code with PlatformIO as my development environment, and I've even tried moving all the code to a clean project in case there is any weird caching going on, and the waveform still shows up.

(It's not an oscilloscope issue; when I turn off the Teensy the signal goes away as expected.)

Only after commenting out the lines in the demo program do I not get the waveform.
 
OK, I did another complete clean and started fresh. I touched just the values needed to change the size of the frame being transmitted, and I managed to make it work. The road ahead seems straightforward, but I'll post if any further issues come up. (I'm writing this as an update to mark the above questions as "solved".)
 