Interrupt on Rising and Falling on the same pin

macardoso · Mar 7, 2022

joepasquariello said:
@luni, glad to hear you eliminated the need for popLocked(). I think it's important to clarify that disabling interrupts will not interfere with the capture of edges, but rather with the response to the capture interrupt. If an edge arrives while interrupts are disabled, and interrupts are re-enabled before the next edge occurs, the interrupt will occur as soon as interrupts are re-enabled, and the same value will be read from the capture register as if interrupts had never been disabled. It's definitely better to avoid disabling interrupts, but as long as interrupt disable periods are less than the time between edges, with some margin, then no edges will be missed, and the decoder will get exactly the same data as if there were no disable/enable. For quadrature counting, the counting continues normally while interrupts are disabled, but if your code depends on reading the count at precise intervals via a timer interrupt, that read can be delayed by however long interrupts are disabled, and that can reduce the accuracy of an inferred frequency.

I think @luni was replying to my earlier statement regarding the need for this application to do things beyond just reading the serial stream. I'll actually need to track a quadrature encoder and respond to over/underflow interrupts (I think this is done in hardware) as well as read/write a second serial channel (2.5Mbaud, NRZ). Although that brings up a good point... Does noInterrupts() block serial data from reaching the serial buffer on a hardware serial port?

macardoso · Mar 7, 2022

luni said:
Spoiler alarm: If you want you can have a look at the new code in the GitHub repo. I improved the edge detection code to be much faster and changed the decoder to handle TS5643 data fields. The receiver should run out of the box and display the encoder counts (it also parses the various flags if you are interested). I observed that from time to time some high priority interrupt delays the edge detection for about 1µs so that you'll get reading errors. Most of those will be caught but since I didn't implement the CRC some of those might pass. I wasn't able to find the actual interrupt source but the error rate increases when you print something. So, probably related to the USB system...

Next thing I want to try is the DMA path...

Holy smokes, you got that done fast! And no joke, it works perfectly. I have to dive into the code to understand how this one works differently. I'll try to tackle the CRC too

Before looking, I'm just curious if you included protection for matching frame 1 to a subsequent frame 2. For example, if frame 1 arrives, then frame 2 is missed, and so are a bunch of other bursts, then a later frame 2 is captured, is that pair thrown away? Or more likely if the first packet it grabs is a frame 2 without a preceding frame 1, is it thrown away?

Anyways thanks so very much for taking interest in my project and helping me along. I still want to work through all the logic again so I hope you don't mind if I post a few more questions if I run into any!

macardoso · Mar 7, 2022

luni said:
Usually you want to separate the actual code (the definition) from the declaration in the header. There are various reasons for that.

If you have all the code in the header it needs to be compiled for each and every compilation unit which includes the header. In large projects (and in the old days) this would generate a significant compilation time penalty. For the tiny Arduino stuff and the fast computers we have now compilation time is usually not a big concern anymore.

For commercial projects you don't want to distribute your source code. The only thing users of a library need are the headers with the declarations. The actual code can then be distributed in a compiled format (object files *.o).

Encapsulation. Generally you try to hide away as much information about your implementation as possible and provide only a small interface to the users of your code. (google for "code against interfaces not implementations" if you want to learn about this technique). So, better not expose your code in the header.

However, for templated code like the one in RingBuf.h having the code in the header is mandatory.

But, again, given that this is all for hobby and it is quicker to write, you see code in headers more often these days

That makes a lot of sense! Thanks for the explanation!
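To make the declaration/definition split concrete, here is a minimal sketch. The file names and the EdgeCounter and Accumulator types are made up for illustration, and the three "files" are shown in one listing so that it compiles as a single unit.

```cpp
#include <cstddef>
#include <cstdint>

// ---- edgecounter.h (hypothetical): declaration only, all a user needs to see ----
class EdgeCounter
{
 public:
    void addEdge();          // implementation lives in the .cpp file
    uint32_t count() const;

 private:
    uint32_t edges = 0;
};

// ---- accumulator.h (hypothetical): a template keeps its code in the header,
// because the compiler generates the code for each T/N where it is used ----
template <typename T, size_t N>
class Accumulator
{
 public:
    bool push(T v)
    {
        if (used == N) return false; // full, value is dropped
        data[used++] = v;
        return true;
    }
    size_t size() const { return used; }

 private:
    T data[N] = {};
    size_t used = 0;
};

// ---- edgecounter.cpp (hypothetical): the definitions, compiled exactly once ----
void EdgeCounter::addEdge() { ++edges; }
uint32_t EdgeCounter::count() const { return edges; }
```

A sketch that includes edgecounter.h only sees the class layout and the two member declarations, while every sketch that uses Accumulator recompiles its member functions.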

luni said:
No, this is not inline assembly but perfectly valid C++ code. I didn't want to hard code the address of the second channel of the TMR1 module. If you want to use another timer module/channel you only need to adjust this line. It defines a pointer to an object of type IMXRT_TMR_CH_t which is defined in imxrt.h (see here: https://github.dev/PaulStoffregen/c...dbca8032763fe97e2a99e7e/teensy4/imxrt.h#L7884) Information on the TMR timers is found in chapter 54 of the IMXRT manual. imxrt.h defines the vast majority of the symbols you find in the manual.

Ah, there it is! I was aware of the kinetis.h file for the Teensy 3.x but never knew the equivalent for the 4.x boards. I'll dive into that to see if I can understand how you selected all the low level code you added.

luni said:
Yes, but without locking you may run into issues when the pop code is interrupted by some push code. In the posts above it is discussed that circular buffers should have no issues with this, but the implementation used here does. The current code from my GitHub repo replaced the full-blown RingBuf.h buffer by a very simple implementation which doesn't need to disable interrupts during popping. I was able to reduce interrupt time to about 40-80ns (can't measure more accurately)

Got it. Now that I have some background on Ring Buffers, the discussion above makes much more sense to me.
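For readers following along, here is a rough sketch of why a simple ring buffer can get away without locking when there is exactly one writer (the ISR) and one reader (the main loop). This is not the code from the repo; the SimpleRing name and its interface are assumptions made for illustration.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Single-producer / single-consumer ring buffer (illustrative only).
// The producer is the only one writing `head`, the consumer is the only one
// writing `tail`, so neither side can corrupt the other's index and pop()
// does not have to disable interrupts.
template <typename T, size_t N>
class SimpleRing
{
    static_assert(N != 0 && (N & (N - 1)) == 0, "N must be a power of two");

 public:
    bool push(T v) // producer side, e.g. called from the edge ISR
    {
        const uint32_t h = head.load(std::memory_order_relaxed);
        if (h - tail.load(std::memory_order_acquire) == N) return false; // full
        buf[h & (N - 1)] = v;
        head.store(h + 1, std::memory_order_release); // publish after the data is written
        return true;
    }

    bool pop(T& out) // consumer side, e.g. called from the main loop
    {
        const uint32_t t = tail.load(std::memory_order_relaxed);
        if (t == head.load(std::memory_order_acquire)) return false; // empty
        out = buf[t & (N - 1)];
        tail.store(t + 1, std::memory_order_release); // free the slot after reading it
        return true;
    }

 private:
    T buf[N];
    std::atomic<uint32_t> head{0};
    std::atomic<uint32_t> tail{0};
};
```

The indices count up freely and are masked only when indexing, so "full" and "empty" can be told apart without wasting a slot. With more than one producer or more than one consumer this scheme no longer holds.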

luni said:
The line "using EdgeBuffer = RingBuf<uint32_t, 65000>;" is just a typedef to not always have to write RingBuf<uint32_t, 65000>. I.e., it simply defines the short name EdgeBuffer.

EdgeBuffer (or fully written RingBuf<uint32_t, 65000>) is the type (like int, float etc) and buffer is the name of the variable. Same as you have in float x = 3.0; float is the type and x is the name of the variable.

OK so EdgeBuffer is a nickname for the full datatype of RingBuf<uint32_t, 65000>. And that is a datatype because the RingBuf library used template meta programming.
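A small self-contained illustration of the alias. The RingBuf below is only a stand-in with the same template parameters, not the real library class.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Stand-in for the library's class template (hypothetical, only the shape matters).
template <typename T, size_t N>
struct RingBuf
{
    T data[N];
};

// Alias: a second, shorter name for exactly the same type.
using EdgeBuffer = RingBuf<uint32_t, 65000>;

// Both spellings name one and the same type.
static_assert(std::is_same<EdgeBuffer, RingBuf<uint32_t, 65000>>::value, "same type");

// Type: EdgeBuffer, variable name: buffer (compare `float x;`).
EdgeBuffer buffer;
```

Each distinct set of template arguments produces its own type, so RingBuf<uint32_t, 65000> and, say, RingBuf<uint16_t, 256> are unrelated types as far as the compiler is concerned.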

luni said:
That's perfectly correct.

Nice

luni said:
The critical timing is the edgeProvider, this needs to be as fast as possible since it shouldn't miss edges. The decoding is not so critical since it can work async on the stored edges.

Makes sense

luni said:
yield is called whenever teensyduino is looping. I.e. it is called once per loop and e.g. while delay or other long running code is spinning. -> it is usually called more often than you have calls from the main loop. It also allows you to use e.g. delay in the main loop without having to worry about not calling tick fast enough.

OK so this is specifically implemented in Teensyduino. That's why it is new to me coming from Arduino. Makes a lot of sense. Runs at least as often as loop() but potentially more often when blocking functions are running.

luni said:
I observed that from time to time some high priority interrupt delays the edge detection for about 1µs so that you'll get reading errors. Most of those will be caught but since I didn't implement the CRC some of those might pass. I wasn't able to find the actual interrupt source but the error rate increases when you print something. So, probably related to the USB system...

I wonder if this is the same reason my early on code using periodic timer interrupts would glitch. It would seem like no matter what I did, every once in a while something would delay the execution of the interrupt by several microseconds.

luni said:
Next thing I want to try is the DMA path...

Wayyyy over my head, but I'd love to learn from what you do.

luni · Mar 7, 2022

Before looking, I'm just curious if you included protection for matching frame 1 to a subsequent frame 2. For example, if frame 1 arrives, then frame 2 is missed, and so are a bunch of other bursts, then a later frame 2 is captured, is that pair thrown away? Or more likely if the first packet it grabs is a frame 2 without a preceding frame 1, is it thrown away?

Yes, here is the code which makes sure that the frames are read consecutively: https://github.dev/luni64/mancheste...f53371210d/receiver/src/decoder.cpp#L107-L128.
It checks if the read frame is what you expected by looking at the frame# bit. If not it resets everything and tries again. If it successfully got frame 1 and frame 2 it passes back the two payloads for the necessary bit shuffling.

Whenever it finds an issue in a frame it starts again waiting for a frame 1. You can add your CRC check in this part of the code as well.
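The pairing logic described above can be sketched roughly as follows. This is not the code from decoder.cpp; the Frame struct, its fields and the FramePairer class are hypothetical names used only to show the idea.

```cpp
#include <cstdint>

// Hypothetical result of decoding one burst.
struct Frame
{
    uint8_t  frameNr; // 1 or 2, taken from the frame# bit
    uint32_t payload;
    bool     ok;      // false if the decoder found a timing/format problem
};

class FramePairer
{
 public:
    // Feed frames in arrival order. Returns true once a frame 1 has been
    // directly followed by a good frame 2; the payloads are then in p1 and p2.
    bool feed(const Frame& f, uint32_t& p1, uint32_t& p2)
    {
        if (!f.ok) {                 // any problem: drop everything, wait for a frame 1
            haveFirst = false;
            return false;
        }
        if (f.frameNr == 1) {        // (re)start a pair, an older frame 1 is replaced
            first = f.payload;
            haveFirst = true;
            return false;
        }
        if (f.frameNr == 2 && haveFirst) { // the expected second half
            p1 = first;
            p2 = f.payload;
            haveFirst = false;
            return true;
        }
        haveFirst = false;           // frame 2 without a preceding frame 1: discard
        return false;
    }

 private:
    uint32_t first = 0;
    bool haveFirst = false;
};
```

Note that checking the frame# bit alone cannot tell whether a frame 2 belongs to the same burst pair as the stored frame 1. Catching a frame 2 that arrives many bursts later would additionally need a check on the time between the two frames, which this sketch leaves out.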

Feel free to ask as many questions as you like, there are a lot of very good embedded programmers around here who love to help with interesting problems.

luni · Mar 7, 2022

I wonder if this is the same reason my early on code using periodic timer interrupts would glitch. It would seem like no matter what I did, every once in a while something would delay the execution of the interrupt by several microseconds.

Actually I was thinking the same. @PaulStoffregen: any idea which high priority (probably prio 0) interrupt runs for about 400-800 ns? It seems to be related to the CDC interface. But the only USB interrupts I see are IRQ_USB1 and IRQ_USB2 which both have a priority of 128 and should easily be interrupted by the edge interrupt which runs at priority 32. (Setting it to 0 doesn't help)

Here is an example:

The interrupt should start at about the red arrow. It actually starts some 500ns late, which is after the next edge, and thus generates a timing error. It works correctly for tens of thousands of edges, but sometimes it happens. Printing to e.g. Serial2 instead of Serial seems to fix it.

Edit: it could also be that some lower-priority interrupt disables interrupts somewhere in its ISR

joepasquariello · Mar 7, 2022

macardoso said:
I think @luni was replying to my earlier statement regarding the need for this application to do things beyond just reading the serial stream. I'll actually need to track a quadrature encoder and respond to over/underflow interrupts (I think this is done in hardware) as well as read/write a second serial channel (2.5Mbaud, NRZ). Although that brings up a good point... Does noInterrupts() block serial data from reaching the serial buffer on a hardware serial port?

Disabling interrupts does not disable peripherals. The peripherals continue to operate and set event/interrupt flags, but interrupts are "masked", so the CPU does not process the event/interrupt until that mask is removed. With respect to reading a quadrature encoder, the quadrature counting goes on as normal regardless of whether interrupts are enabled or disabled. Whenever you get around to reading the running quadrature count, it will be correct. Be careful, though, because if your design is based on reading the quadrature count at precise intervals and then computing frequency or speed as counts/time, disabling interrupts will reduce the accuracy of the frequency or speed calculation by causing jitter in the time between reads. The same thing is true of serial. The peripheral continues to work while interrupts are disabled. As long as the disable time is short, where short means less than a single byte time, then no bytes will be lost. If the UART has a hardware FIFO, and there is room in the FIFO, then the interrupt disable time can be longer without losing data.
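To put a number on "less than a single byte time", here is a small worked calculation. It assumes 8N1 framing (10 bits on the wire per character); the number of free FIFO entries is a parameter you would have to look up for the UART in question, no particular depth is assumed here.

```cpp
// Time on the wire for one UART character, in nanoseconds.
// Default assumes 8N1 framing: 1 start + 8 data + 1 stop = 10 bits.
constexpr double byteTimeNs(double baud, unsigned bitsPerChar = 10)
{
    return bitsPerChar * 1e9 / baud;
}

// Rough upper bound for how long interrupts may stay disabled before the
// receiver overruns, given the number of free FIFO entries
// (use 1 for a UART without a FIFO).
constexpr double maxDisableNs(double baud, unsigned freeFifoEntries)
{
    return freeFifoEntries * byteTimeNs(baud);
}
```

For the 2.5 Mbaud channel mentioned above, byteTimeNs(2.5e6) gives 4000 ns, so without any FIFO headroom the interrupt disable time has to stay well below 4 µs.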

joepasquariello · Mar 7, 2022

luni said:
Actually I was thinking the same. @PaulStoffregen: any idea which high priority (probably prio 0) interrupt runs for about 400-800 ns? It seems to be related to the CDC interface. But the only USB interrupts I see are IRQ_USB1 and IRQ_USB2 which both have a priority of 128 and should easily be interrupted by the edge interrupt which runs at priority 32. (Setting it to 0 doesn't help)

Here is an example:
View attachment 27765

The interrupt should start at about the red arrow. It actually starts some 500ns late, which is after the next edge, and thus generates a timing error. It works correctly for tens of thousands of edges, but sometimes it happens. Printing to e.g. Serial2 instead of Serial seems to fix it.

Edit: it could also be that some lower-priority interrupt disables interrupts somewhere in its ISR

Sharing some recent experience, I'm working on a T3.5 system with 10 kHz control frequency (100 us period). With default priority for the IntervalTimer, USB serial printing gets in the way of the timing, but with priority=0, max deviation time between interrupts is about 100 ns, over many hours, even with USB serial printing.

luni · Mar 7, 2022

Sharing some recent experience, I'm working on a T3.5 system with 10 kHz control frequency (100 us period). With default priority for the IntervalTimer, USB serial printing gets in the way of the timing, but with priority=0, max deviation time between interrupts is about 100 ns, over many hours.

Even 100ns seems strange. Maybe Paul knows what's running there with high priority or disabling interrupts when serial printing.

joepasquariello · Mar 7, 2022

luni said:
Even 100ns seems strange. Maybe Paul knows what's running there with high priority or disabling interrupts when serial printing.

I took the 100 ns to be minimum variation in response to interrupts, as opposed to interrupts being disabled. Will be curious to hear more on this. It's important to me to know how to avoid all disabling of interrupts if necessary. For example, T4 hardware serial has some disabling of interrupts, whereas T3 does not.

joepasquariello · Mar 7, 2022

luni said:
Even 100ns seems strange. Maybe Paul knows what's running there with high priority or disabling interrupts when serial printing.

@luni, I looked into interrupt latency on ARM Cortex, and I found the table below. T3.5 is M4, and I'm running at 120 MHz, so the max latency of 12 cycles is 12/120E6 = 0.1E-6 = 100 ns. I think that indicates that 100 ns is as good as I can expect on T3.5, and does not imply any interrupt disabling. Note that this specifies zero-wait-state memory, and I do have my ISR in on-chip RAM.

NVIC Cortex M interrupt latency (cycles) with zero wait state memory

Cortex-M0 16
Cortex-M0+ 15
Cortex-M3 12
Cortex-M4 12

EDIT: Latency is also 12 cycles for M7 (T4), so that would be 12/600E6 = 0.02E-6 = 20 ns
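The same arithmetic as a tiny helper, in case someone wants to plug in other clock speeds. The function name is made up; the cycle counts are the ones from the table above.

```cpp
// Interrupt entry latency in nanoseconds for a given cycle count and CPU clock.
constexpr double latencyNs(unsigned cycles, double cpuHz)
{
    return cycles * 1e9 / cpuHz;
}

constexpr double t35LatencyNs = latencyNs(12, 120e6); // T3.5 at 120 MHz: 100 ns
constexpr double t4LatencyNs  = latencyNs(12, 600e6); // T4 at 600 MHz: 20 ns
```

These figures only hold for zero-wait-state memory, as noted above; code or vector tables in slower memory will add to them.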

luni · Mar 8, 2022

DMA based decoder

I just uploaded a DMA based version. The DMA setup is surprisingly simple. Here is the relevant code. Full version here (https://github.dev/luni64/manchesterCapture/blob/main/receiver/src/edgeproviderDMA.cpp)

Code:

    constexpr size_t bufSize = 256;                                         // DMA buffer for edge timestamps (512 bytes, 256 timestamps)
    uint16_t buf[bufSize] __attribute__((aligned(512)));                    // The DMA controller will replace the lowest n-bits of the address by a counter
                                                                            // to implement the circular buffer -> we need to align the start address of the buffer
    void EdgeProviderDMA::init()                                            // such that it corresponds to a countervalue of 0
    {                                                                       //
        *(portConfigRegister(11)) = 1;                                      // ALT1, use pin 11 as input to TMR1_2
                                                                            //
        ch->CTRL  = 0;                                                      // stop timer
        ch->SCTRL = TMR_SCTRL_CAPTURE_MODE(3);                              // capture on both edges
        ch->LOAD  = 0;                                                      // reload the counter with 0 at rollover (doesn't work without setting this explicitly)
        ch->DMA   = TMR_DMA_IEFDE;                                          // DMA on capture events
        ch->CTRL  = TMR_CTRL_CM(1) | TMR_CTRL_PCS(8 + 3) | TMR_CTRL_SCS(2); // start, source: peripheral clock, prescaler 3 (=> dt = 1/150MHz * 8 = 53ns resolution, 2^16 * 53ns = 3.5ms max), use counter 2 input pin for capture
                                                                            //
        dmaChannel.begin();                                                 //
        dmaChannel.triggerAtHardwareEvent(DMAMUX_SOURCE_QTIMER1_READ2);     // trigger DMA by capture event on channel 2
        dmaChannel.source(ch->CAPT);                                        // DMA source = capture register (16 bit)
        dmaChannel.destinationCircular(buf, bufSize * sizeof(uint16_t));    // use a circular buffer as destination. Buffer size in bytes
        dmaChannel.enable();
    }

    uint16_t EdgeProviderDMA::popTimestamp()
    {
        return buf[tail++];
    }

    bool EdgeProviderDMA::hasElements()
    {
        return dmaChannel.destinationAddress() != (buf + tail);
    }

The DMA controller writes the captured timestamps into a simple 512 byte ring buffer from where the decoder reads and analyzes them in the same way as in the previous versions. This runs absolutely stable so far, I didn't detect any reading errors. The edge buffer is a bit small. I'll try to make that configurable in later versions.
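Since the buffer holds raw 16-bit timer values, the decoder has to turn them into edge-to-edge intervals and cope with the counter rolling over. A minimal sketch of that step follows; the names are made up, and the clock figures are the ones from the comments in the code above (150 MHz divided by 8).

```cpp
#include <cstdint>

// Timer tick length for the setup above: 150 MHz peripheral clock divided by 8.
constexpr double tickNs = 8.0 * 1e9 / 150e6;      // about 53.3 ns per count
constexpr double wrapMs = 65536.0 * tickNs / 1e6; // the 16-bit counter wraps after about 3.5 ms

// Interval between two captured edges in timer ticks.
// Unsigned 16-bit subtraction gives the right answer across one counter
// rollover, as long as the two edges are less than one full wrap apart.
constexpr uint16_t ticksBetween(uint16_t earlier, uint16_t later)
{
    return static_cast<uint16_t>(later - earlier);
}
```

For example, an edge captured at 65530 followed by one at 4 is 10 ticks later, even though the raw values went "backwards". Gaps longer than one wrap (about 3.5 ms here) cannot be recovered from the timestamps alone.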

@macardoso Let me know if this version works for you.

I'm also thinking of making the DMA Manchester decoder a general purpose library and your TS5643 application a usage example. Manchester decoding pops up here about once per year. So, a library might be useful for a few.

macardoso · Mar 8, 2022

luni said:

I just uploaded a DMA based version. The DMA setup is surprisingly simple.

Wow again!! Nice work. I am swamped with work so haven't had time to dig into all the stuff discussed above but I did upload this to my test setup really quickly and it works perfectly! I commented out the error printed to the serial port if the counter did not increment by 1 since my encoder is stationary, but it looks perfect!

I'd be interested to discuss what exactly the DMA is doing and what benefits it offers over the previous version of code.

For reference, here is my serial monitor (note the battery alarm bits are set since I do not have a 3.6V backup battery attached to the encoder on my test bench):

Code:

cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0
cnt: 312, BE+OS:1 OF:1 OS:1 BA:1 PS:0 CE:0

I'm really impressed by everyone and need to really dive in hard to learn all this stuff. Thanks again!

EDIT: I see no glitches either! Would love to know why

joepasquariello · Mar 8, 2022

@luni, wow, nice work.

macardoso said:
I'd be interested to discuss what exactly the DMA is doing and what benefits it offers over the previous version of code. I see no glitches either! Would love to know why.

Me, too. The DMA provider is so short, it seems like this would be a good example for DMA beginners.

luni · Mar 9, 2022

I'd be interested to discuss what exactly the DMA is doing and what benefits it offers over the previous version of code.

Me, too. The DMA provider is so short, it seems like this would be a good example for DMA beginners.

First, I'm no expert in DMA at all so take the following information with a grain of salt.
The DMA (direct memory access) controller is able to copy data from one location to another using the chip hardware without involving the CPU. This is usually much faster than using the CPU, which basically needs to load the source and destination addresses into registers, then copy the data from the source address into another register and from there to the destination address. Also, the DMA controller is not blocked while high priority interrupts execute. However, it needs to "steal" chip internal buses from the CPU to perform the actual copying. Depending on how many DMA channels want to transfer data simultaneously, the copy may take longer. AFAIK there is no guaranteed max time for the transfer but usually it is much faster than using the CPU.

Setup: Obviously, one needs to inform the DMA controller about the source and destination addresses, how many bytes to transfer and what shall trigger a transfer. There are a lot of options available, and setting the thing up by writing to the actual registers is quite involved. But Teensyduino provides the DMAChannel class, which takes care of the low level configuration and is not difficult to use. Here is the example from above:

Code:

1) dmaChannel.begin();                                                
2) dmaChannel.triggerAtHardwareEvent(DMAMUX_SOURCE_QTIMER1_READ2);     // trigger DMA by capture event on channel 2
3) dmaChannel.source(ch->CAPT);                                        // DMA source = capture register (16 bit)
4) dmaChannel.destinationCircular(buf, bufSize * sizeof(uint16_t));    // use a circular buffer as destination. Buffer size in bytes
5) dmaChannel.enable();

1) Basically requests one of the 32 DMA channels from the controller and initializes it.
2) Tells the controller to trigger a transfer whenever channel 2 of QTIMER1 (i.e. the third channel of TMR module 1) detects an event on its input pin.
3) Tells the controller to use the address of the capture register (CAPT) of this timer channel as source address.
4) Defines the destination address of the transfer. In this case we request that the controller should copy the data into our 512 byte ring buffer. The controller will automatically increase the destination address after each transfer and will roll over correctly.
5) Starts the thing.
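The roll-over in the destination step is also the reason the buffer has to be aligned to its size. The following is a plain software model of that addressing scheme, based on the comment in the code ("replace the lowest n bits of the address by a counter"). It is not the controller's register interface, and the function name is made up.

```cpp
#include <cstdint>

// Model of circular (modulo) destination addressing: only the lowest n bits of
// the address count up, the upper bits stay fixed. sizeBytes must be a power of
// two, and the buffer must start on a multiple of sizeBytes, otherwise the wrap
// does not return to the start of the buffer.
constexpr uint32_t nextDestination(uint32_t addr, uint32_t stepBytes, uint32_t sizeBytes)
{
    const uint32_t mask = sizeBytes - 1;
    return (addr & ~mask) | ((addr + stepBytes) & mask);
}
```

With a 512 byte buffer at the (arbitrarily chosen) address 0x20001000, the address after the last 16-bit slot at 0x200011FE wraps back to 0x20001000. If the buffer instead started at 0x20001100, the writes would run past its end and wrap to 0x20001200, which is why the aligned(512) attribute is needed.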

There are a lot of options for source/destination and trigger settings. Best to look at the sources for the DMAChannel class to investigate the options.

I see no glitches either! Would love to know why

As shown in the LA trace in #57 the glitches come from a late (~500ns) invocation of the ISR, probably due to some other high priority interrupt running at the same time. If the ISR is invoked after the counter detected the next edge, one edge is lost or at least completely wrong.
The DMA transfer of the capture register to the ring buffer is (usually) not delayed that much (even if interrupts are running). So, the chance of skipping one edge is much smaller.

Hope that helps

MarkT · Mar 9, 2022

On a complete aside, I notice the various source and destination buffer methods for DMA channels specify references:

Code:

	void destination(volatile signed int &p) { destination(*(volatile uint32_t *)&p); }
	void destination(volatile unsigned int &p) { destination(*(volatile uint32_t *)&p); }
	void destination(volatile signed long &p) { destination(*(volatile uint32_t *)&p); }
	void destination(volatile unsigned long &p) {

As I understand it you can't convert a pointer to a reference - And if you want to set up DMA channels generically you have a problem
as you only have a pointer value, not a reference to a static register.

Or is there a way to do this? The best I seem to be able to achieve is:

Code:

volatile uint16_t * valReg ;

// code that sets valReg programmatically here

dma.begin(true);
dma.sourceBuffer (pwm_dma_buffer, 2 * BUFSIZE * sizeof(uint16_t));
dma.destination ((uint16_t &)valReg);    // This cast allows it to compile but it doesn't pass the value of valReg pointer it seems.
dma.TCD->DADDR = valReg;    // extra line to fix the above destination() call which doesn't pass the address right (but sets the other fields in dma).

macardoso · Mar 9, 2022

Quick musing... How many copies of this cycle capture and DMA transfer could I conceivably do on one Teensy? I have a lot more application code to add on top of this for my project, so processor loading might be the limiting factor, but could I simultaneously sample and decode two asynchronous bitstreams? Three? Six? My original plan was one Teensy per motor (6 on my robot) but doing more than 1 per Teensy would definitely help on cost.

From a hardware standpoint, 6 motors = 18 high speed digital pins (6 manchester serial, 12 quadrature encoder) + 12 true serial pins (6 NRZ TX and 6 NRZ RX channels). So at least there are enough physical pins. The Teensy 4.1 has 8 hardware serial ports, so that's enough to cover the serial channels. 32 DMA channels should be sufficient since I only would need 6. Not sure about which pins support cycle capture and quadrature decoding (maybe all, maybe a limited subset).

Again I'm happy to use one Teensy per motor if it just works, but curious if there is something fundamentally in my way from expanding that. Or will I just run into more and more timing issues and rare glitches as I increase the load?

luni · Mar 9, 2022

MarkT said:
As I understand it you can't convert a pointer to a reference - And if you want to set up DMA channels generically you have a problem
as you only have a pointer value, not a reference to a static register.
Or is there a way to do this? The best I seem to be able to achieve is...

Does anything speak against a simple:

Code:

void setup()
{
    while (!Serial) {}

    DMAChannel dma;

    // code that sets valReg programmatically here
    volatile uint16_t* valReg = &IMXRT_TMR1.CH[2].CAPT;  // pointer to some register
   
    dma.begin(true);
    dma.destination(*valReg);

    Serial.printf("valReg %p\n", valReg);
    Serial.printf("DADDR  %p\n", dma.TCD->DADDR);

    pinMode(LED_BUILTIN, OUTPUT);
}

void loop()
{
}

The code prints:

Code:

valReg 0x401dc044
DADDR  0x401dc044

Which seems to be OK?
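The difference between the two spellings can be reproduced without any hardware. In the sketch below, addressOf() is a made-up stand-in for a method that takes a reference, and fakeRegister stands in for a real peripheral register.

```cpp
#include <cstdint>

// Stand-in for a destination(volatile uint16_t&) style method: it receives a
// reference and reports the address of the object referred to.
inline const volatile void* addressOf(volatile uint16_t& r) { return &r; }

volatile uint16_t fakeRegister = 0;        // pretend this is a hardware register
volatile uint16_t* valReg = &fakeRegister; // pointer that was set "programmatically"

// *valReg is an lvalue for the pointee, so the reference binds to the register.
inline const volatile void* viaDereference() { return addressOf(*valReg); }

// (volatile uint16_t&)valReg reinterprets the pointer variable itself as a
// uint16_t, so the reference binds to valReg's own storage, not to the register.
inline const volatile void* viaCast() { return addressOf((volatile uint16_t&)valReg); }
```

viaDereference() yields the address of fakeRegister, while viaCast() yields the address of the pointer variable valReg, which matches the symptom described above: the call compiles but the wrong address ends up in the channel configuration.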

luni · Mar 9, 2022

macardoso said:
Quick musing... How many copies of this cycle capture and DMA transfer could I conceivably do on one Teensy? I have a lot more application code to add on top of this for my project, so processor loading might be the limiting factor, but could I simultaneously sample and decode two asynchronous bitstreams? Three? Six? My original plan was one Teensy per motor (6 on my robot) but doing more than 1 per Teensy would definitely help on cost.

From a hardware standpoint, 6 motors = 18 high speed digital pins (6 manchester serial, 12 quadrature encoder) + 12 true serial pins (6 NRZ TX and 6 NRZ RX channels). So at least there are enough physical pins. The Teensy 4.1 has 8 hardware serial ports, so that's enough to cover the serial channels. 32 DMA channels should be sufficient since I only would need 6. Not sure about which pins support cycle capture and quadrature decoding (maybe all, maybe a limited subset).

Again I'm happy to use one Teensy per motor if it just works, but curious if there is something fundamentally in my way from expanding that. Or will I just run into more and more timing issues and rare glitches as I increase the load?

My gut feeling is this will get unstable with 6 motors. Your 2MHz Manchester stream generates a lot of data which needs to be copied around and analyzed. Plus, there are only 4 HW quadrature decoders available. The pins are fixed (lib: https://github.com/mjs513/Teensy-4.x-Quad-Encoder-Library). So might be better to try with two boards handling 3 motors each. Don't know about these NRZ RX/TX channels. Are these normal serial ones?

Nice project by the way

macardoso · Mar 9, 2022

luni said:
My gut feeling is this will get unstable with 6 motors. Your 2MHz Manchester stream generates a lot of data which needs to be copied around and analyzed. Plus, there are only 4 HW quadrature decoders available. The pins are fixed (lib: https://github.com/mjs513/Teensy-4.x-Quad-Encoder-Library). So might be better to try with two boards handling 3 motors each. Don't know about these NRZ RX/TX channels. Are these normal serial ones?

Nice project by the way

Thanks! It is fun. This coding stuff is only a small part of a broader effort to collect information, get a 23 year old control running, and hopefully be able to run the robot.

I figured off the bat that it really isn't reasonable to run all 6 motors through one board. Even 2 or 3 motors per board would significantly cut down on cost and motherboard size. The quadrature encoder input brings the same position data as the serial channel, but it is measured incrementally, not absolutely. In addition, the quadrature signal carries 13 bits of single-turn data rather than the 11 bits transmitted over the serial channel. Also, the position changes are streamed constantly rather than at periodic intervals with every other serial burst (84us delay). So there is a benefit to using both channels. I could just read the Manchester serial once and track incrementally from there, but I think it might be better to cross-check the two position registers against each other. The NRZ serial is a new serial channel which I haven't worked with yet. It is the RS422 2.5 Mbaud connection to the servo drive. The servo drive requests position and status updates and the Teensy needs to formulate a proper reply based on the data coming from the encoder. This connection permits zero missed responses, so I need to carefully code it to prioritize the read/write of this channel over reading the Manchester stream.

Basically the Teensy is emulating the format of an encoder the drive is expecting to talk to. That datasheet is attached if you are curious...

Next I'm going to figure out how to implement the CRC checking (I think it is a repeated XOR of 1011 with the 18 data bits, then verifying that the remainder matches the 3 remaining CRC bits at the end). Then I'll mess with the hardware encoder. I need it to roll over at 2^13 and trigger over/underflow interrupts to increment a multiturn counter. The encoder incremental track has 2^13 counts per turn and 2^13 multiturn counts... I guess that wouldn't overflow a 32 bit register... hrm.
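A rough sketch of what such a check could look like, taking the description above at face value: generator 1011 (x^3 + x + 1), 18 data bits followed by 3 CRC bits, most significant bit first, no initial value and no final inversion. All of these are assumptions that would need to be confirmed against the encoder datasheet, and the function names are made up.

```cpp
#include <cstdint>

// Remainder of `value` (totalBits wide, MSB first) divided by the generator
// 1011 using carry-less (XOR) long division.
constexpr uint8_t polyRemainder(uint32_t value, unsigned totalBits)
{
    for (unsigned bit = totalBits; bit-- > 3;) { // walk from the MSB down to bit 3
        if (value & (1u << bit)) value ^= (0b1011u << (bit - 3));
    }
    return static_cast<uint8_t>(value & 0x7u);   // the low 3 bits are the remainder
}

// CRC over the 18 data bits: append three zero bits, then take the remainder.
constexpr uint8_t crc3(uint32_t data18)
{
    return polyRemainder((data18 & 0x3FFFFu) << 3, 21);
}

// A 21-bit frame (18 data bits followed by 3 CRC bits) is consistent
// if dividing the whole frame leaves no remainder.
constexpr bool frameOk(uint32_t frame21)
{
    return polyRemainder(frame21 & 0x1FFFFFu, 21) == 0;
}
```

Comparing crc3(data) with the received CRC bits and calling frameOk() on the whole frame are equivalent. If the real frames fail this check, the bit order or the position of the CRC field are the first things to question.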

Anyways I'll take the position data and status flags, bit shift the single turn data by 4 bits since the encoder I'm trying to emulate has 17 bits of single turn data (vs 13 on the one I have). Additionally the encoder I'm trying to emulate has some EEPROM for motor data which I need to emulate by responding to the drive with known data to make it think the encoder provided the requested EEPROM data. The encoder I'm emulating is a bit fancier than the one I have, but it is serial only, no quadrature signals.
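The bit shuffling described here is small enough to write down. The helper names are made up; the widths (13-bit single turn, 17-bit target, 13-bit multiturn) are the ones mentioned in the posts above.

```cpp
#include <cstdint>

constexpr unsigned kSingleTurnBits = 13; // resolution of the encoder on the bench
constexpr unsigned kTargetBits     = 17; // resolution of the emulated encoder

// Left-align a 13-bit position in a 17-bit field; the 4 new low bits are zero.
constexpr uint32_t to17Bit(uint32_t pos13)
{
    return (pos13 & 0x1FFFu) << (kTargetBits - kSingleTurnBits);
}

// Multiturn count and single-turn position in one word: 13 + 13 = 26 bits,
// which fits in a 32-bit register with room to spare.
constexpr uint32_t combine(uint32_t turns, uint32_t pos13)
{
    return ((turns & 0x1FFFu) << kSingleTurnBits) | (pos13 & 0x1FFFu);
}
```

The largest combined value is 0x3FFFFFF, so the overflow concern raised above does not arise for a 32-bit counter.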

Here is the datasheet for the encoder I am emulating. Split into 3 documents to deal with the file size restrictions
View attachment TS5669N124_spec_Reduced_Part1.pdf
View attachment TS5669N124_spec_Reduced_Part2.pdf
View attachment TS5669N124_spec_Reduced_Part3.pdf
