Help! Program works on T3.5 but fails on T4.x (DCC++)

Status
Not open for further replies.

mjs513

Senior Member+
Was working on my DCC++ for Model Train Control with Teensy 3.x and Teensy 4.x as a result of problem that some one was running into with running on a T4.1. Going to get complicated so here it goes.

In DCC you have 2 tracks available - the main track where you run you DCC enabled trains and a second track (PROG) that allows you program DCC engines and run diagnostics on the engine. While running the train on the main track works without a problem on both the T3.5 and the T4.1 running diagnostic commands on the program track seems to hang the T4.1 somewhere - in this case I mean seems like its stuck in a infinite loop somewhere - and only way to get out of it is to restart. Just in case there was a hard fault somewhere I added in @Frank B's HardFault code but when the sketch failed no hard faults were indicated. Now for the details.

Using a Arduino Motor Shield (yes its compatible with 3.3v processors) even the current sense pins on my breakout board (well tested and been using for awhile) so I know its not that I went ahead and issued 2 commands:
First one is to turn on acknowledgement on the program track with a
Code:
<D ACK ON>
this seems to be working - I did have to change a bunch on int's to int16's since there was some conflict between using 16-bit (uno and mega) and a 32-bit teensy.

Second thing you have to do is to issue a
Code:
<R 8 1 1>
which sends a message to the DCC enabled loco to send by Manufactures ID.

Note: This is all interrupt driven since the message is encoded in the signal.

On a T3.5 I can send that command all day with an issue. On a T4.1 I may be able to send it once or twice and then it stops working and I can't figure out why - it looks like it should work. Here is what the output looks like when it works:
Code:
Auto Prog power on
ACK baseline=29/86mA Threshold=49/146mA Duration: 2000us <= pulse <= 8500us
V0 cv=8 bit=7DCCWaveform
DCCWaveform ack pending
Ack timeout
ackPending: 0
ACK PENDING FALSE: getACKackDetected: 0

NO-ACK after 145mS max=31/92mA pulse=0uS
V1 cv=8 bit=7DCCWaveform
DCCWaveform ack pending
6380
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=163/487mA pulse=6380uS
V0 cv=8 bit=6DCCWaveform
DCCWaveform ack pending
6322
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=155/463mA pulse=6322uS
V0 cv=8 bit=5DCCWaveform
DCCWaveform ack pending
6322
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=147/439mA pulse=6322uS
V0 cv=8 bit=4DCCWaveform
DCCWaveform ack pending
6380
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=163/487mA pulse=6380uS
V0 cv=8 bit=3DCCWaveform
DCCWaveform ack pending
6322
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=159/475mA pulse=6322uS
V0 cv=8 bit=2DCCWaveform
DCCWaveform ack pending
6264
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=144/430mA pulse=6264uS
V0 cv=8 bit=1DCCWaveform
DCCWaveform ack pending
6264
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=164/490mA pulse=6264uS
V0 cv=8 bit=0DCCWaveform
DCCWaveform ack pending
Ack timeout
ackPending: 0
ACK PENDING FALSE: getACKackDetected: 0

NO-ACK after 145mS max=31/92mA pulse=0uS
VB cv=8 value=1296322
Got Ack
ACK PENDING FALSE: getACKackDetected: 1

ACK after 32mS max=165/493mA pulse=6322uS
Auto Prog power off
Callback(129)
<r1|1|8 129>

and when it fails (the I did add a bunch of serial prints since it seemed to hanging around there);
Code:
Auto Prog power on
ACK baseline=29/86mA Threshold=49/146mA Duration: 2000us <= pulse <= 8500us
V0 cv=8 bit=7DCCWaveform
DCCWaveform ack pending
Ack timeout
ackPending: 0

Basically what I was looking at is a flow something like this:
Code:
void DCCEXParser::parse(Print *stream, byte *com, bool blocking)
|
|--    case 'R': // READ CV ON PROG      //I am using <R 8 1 1>
        if (params == 3)
        { // <R CV CALLBACKNUM CALLBACKSUB>
            if (!stashCallback(stream, p))
                break;
            DCC::readCV(p[0], callback_R, blocking);
            return;
        }
		
		|
		|---void DCC::readCV(int16_t cv, ACK_CALLBACK callback, bool blocking)  {
			  ackManagerSetup(cv, 0,READ_CV_PROG, callback, blocking);
			}
			
			|
			|---void  DCC::ackManagerSetup(int16_t wordval, ackOp const program[], ACK_CALLBACK callback, bool blocking) {
				  ackManagerWord=wordval;
				  ackManagerProg = program;
				  ackManagerCallback = callback;
				  if (blocking) ackManagerLoop(blocking);
				}
				
				|
				|---void DCC::ackManagerLoop(bool blocking) {
					  while (ackManagerProg) {
						byte opcode=pgm_read_byte_near(ackManagerProg);
						.....
						  case V0:
						  case V1:      // Issue validate bit=0 or bit=1  packet
							{
						  if (checkResets(blocking, RESET_MIN)) return; 
							  if (Diag::ACK) DIAG(F("\nV%d cv=%d bit=%d"),opcode==V1, ackManagerCv,ackManagerBitNum); 
							  byte instruction = VERIFY_BIT | (opcode==V0?BIT_OFF:BIT_ON) | ackManagerBitNum;
							  byte message[] = {cv1(BIT_MANIPULATE, ackManagerCv), cv2(ackManagerCv), instruction };
						Serial.println("DCCWaveform");
							  DCCWaveform::progTrack.schedulePacket(message, sizeof(message), PROG_REPEATS);
						Serial.println("DCCWaveform ack pending");
							  DCCWaveform::progTrack.setAckPending(); 
							}
							break;
						
						|
						|--// Wait until there is no packet pending, then make this pending
							void DCCWaveform::schedulePacket(const byte buffer[], byte byteCount, byte repeats) {	
							void DCCWaveform::setAckPending() {  //sets ackPending = true
							
							//NOTE DCCWaveform is all part of the interrupt handler using Timer.h -- uses IntervalTimer
							|
							|--// process time-edge sensitive part of interrupt
								// return true if second level required
								bool DCCWaveform::interrupt1() {
								|
								|--  // ACK check is prog track only and will only be checked if
									 // this is not case(0) which needs  relatively expensive packet change code to be called.
									 if (ackPending) checkAck();
									 
									 |
									 | -- void DCCWaveform::checkAck() {
											// This function operates in interrupt() time so must be fast and can't DIAG
											if (sentResetsSincePacket > 6) {  //ACK timeout
										Serial.println("Ack timeout");
												ackCheckDuration=millis()-ackCheckStart;
												ackPending = false;
										Serial.printf("ackPending: %d\n", ackPending);
												return;
											}

											lastCurrent=motorDriver->getCurrentRaw();
											if (lastCurrent > ackMaxCurrent) ackMaxCurrent=lastCurrent;
											// An ACK is a pulse lasting between minAckPulseDuration and maxAckPulseDuration uSecs (refer @haba)

											if (lastCurrent>ackThreshold) {
											   if (ackPulseStart==0) ackPulseStart=micros();    // leading edge of pulse detected
											   return;
											}

											// not in pulse
											if (ackPulseStart==0)  return; // keep waiting for leading edge

											// detected trailing edge of pulse
											ackPulseDuration=micros()-ackPulseStart;
										Serial.println(ackPulseDuration);
											if (ackPulseDuration>=minAckPulseDuration && ackPulseDuration<=maxAckPulseDuration) {
												ackCheckDuration=millis()-ackCheckStart;
												ackDetected=true;
												ackPending=false;
												transmitRepeats=0;  // shortcut remaining repeat packets
										Serial.println("Got Ack");
												return;  // we have a genuine ACK result
											}
											ackPulseStart=0;  // We have detected a too-short or too-long pulse so ignore and wait for next leading edge
										}
 
Last edited:
Now looking at the waveforms between the T3.5 and T4.1 when it works the command takes about 1.2seconds to complete and they look alike (Used a Salae Logic Analyzer). When it fails on the T4.1 the signal just keeps going and doesn't cut off so I know it has to waiting for some thing.

Not sure if its an issue with using volatile or something with the second interrupt (main and prog track are controlled via 2 intervaltimers to get the right waveforms). Also used digitalwritefast and digitalreadfast no luck was originally was just digitalread/digitalwrite).

Here is a zip of the whole thing: View attachment CommandStation-EX.zip

UPDATE: This may help: View attachment CommandStation-EX.zip
 
Last edited:
Could it be a timing issue, Teensy 4.x going much faster?
In order to rule out, compile for a much lower speed similar to Teensy 3.x.
 
Could it be a timing issue, Teensy 4.x going much faster?
In order to rule out, compile for a much lower speed similar to Teensy 3.x.

Thanks thought crossed my mind - so I tested yesterday down to 24Mhz. But as a double check after your suggestion I just tested at 150Mhz - still stopped after a couple tries. Even tried tried changing compiler options
 
Good Morning!

I have not had a good chance to go through all of the code:
I don't see anywhere that you call attachInterruptVector, but if you did, I would probably make sure all of my interrupt vectors had something like: asm("dsb");

I have been bit by interrupt handlers being called more than once.

As you mentioned probably look for places that should have volatile

Also could be a simple timing issue that there is small window of time... Example I have seen (and done :eek: ) things like:

Put something on queue that will cause interrupt to happen.
Clear the flag that the ISR will clear
Wait for that flag to be set...

Maybe 999 out of 1000 times the code works fine. But then the 1000th case, right as you for example put something on UART to output that will generate the interrupt, but before your code has chance to clear the flag, an ISR comes in and by the time that ISR completes your byte was output, and the ISR was called and flag set, then returns to your main code that clears the flag and waits for that ISR that already happened...
Been there, done that :D
 
KurtE said:
I don't see anywhere that you call attachInterruptVector, but if you did, I would probably make sure all of my interrupt vectors had something like: asm("dsb");

I have been bit by interrupt handlers being called more than once.
Ok lets see if I can explain this without tripping over myself. A virtual timer is set up in Timer.h that calls off to IntervalTimer that does a attachInterruptVector. The sketch then calls off to:
Code:
  interruptTimer->attachInterrupt(interruptHandler);
which basically just points to the handler for IntervalTimer:
Code:
    void attachInterrupt(void (*isr)())  //From Time.h
    {
        //Serial.printf("attachInterrupt()\n");
        isrCallback = isr;
    }
don't think a attachInterrupVector is needed in this case.

It maybe timing or something with I need to make a few more things volatile.
 
Good morning again:

Just for the fun of it you might try adding the asm("dsb");
somewhere...

The reason I mention it, I have been bit several times where ISR gets triggered multiple times for the same interrupt.

And if I look at IntervalTimer.cpp I see: I removed the #if 0 to save space...
Code:
static void pit_isr()
{
	IMXRT_PIT_CHANNEL_t *channel= IMXRT_PIT_CHANNELS;
	if (funct_table[0] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[0]();}
	channel++;
	if (funct_table[1] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[1]();}
	channel++;
	if (funct_table[2] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[2]();}
	channel++;
	if (funct_table[3] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[3]();}
#endif
}
So it is not doing it... Again maybe a long shot... You could try adding it to your own or to the IntervalTmer ISR and see if makes a difference.
 
Good morning again:

Just for the fun of it you might try adding the asm("dsb");
somewhere...

The reason I mention it, I have been bit several times where ISR gets triggered multiple times for the same interrupt.

And if I look at IntervalTimer.cpp I see: I removed the #if 0 to save space...
Code:
static void pit_isr()
{
	IMXRT_PIT_CHANNEL_t *channel= IMXRT_PIT_CHANNELS;
	if (funct_table[0] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[0]();}
	channel++;
	if (funct_table[1] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[1]();}
	channel++;
	if (funct_table[2] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[2]();}
	channel++;
	if (funct_table[3] != nullptr && channel->TFLG) {channel->TFLG = 1;funct_table[3]();}
#endif
}
So it is not doing it... Again maybe a long shot... You could try adding it to your own or to the IntervalTmer ISR and see if makes a difference.

Will give it a try - just after a fresh pot of coffee.
 
Ok now have another interesting thing going on.

I wanted to try with a Teensy 3.6 but as soon as turn power on with the sketch loaded the T3.6 is sending signals to the lock to move but its not suppose to.

The motor shield uses pins 11/13 for the prog track and 3/12 for the main track. Looks like may have to dig into the code more for the T3.6.
 
@KurtE
Did try and putting asm("dsb")'s in different places associated with the interrupts but still had the problem. Tried playing with volatile's as well but no luck.
 
The "dsb" is only effective for very short ISRs. The one used by the lib is probably long enough. (But this should be fixed anyway I'd say)

Did you try to look if the library calls the functions from timer.h after the startup? The code relies on the correct sequence (init -> setPeriod -> attachInterrupt -> start). I added printouts (using dual serial, to not disturb the communication with the controlling application). Might be interesting to observe if the lib calls those functions later and messes something up.

timer.h
Code:
#pragma once

#include "../VirtualTimer.h"
#include <Arduino.h>

class Timer : public VirtualTimer
{
 public:
    Timer(int timer_num) {}

    void initialize()
    {
        SerialUSB1.printf("init %p\n", timer);
        if (timer != nullptr) stop();
        timer = new IntervalTimer();
    }

    void setPeriod(unsigned long mu)
    {
        SerialUSB1.printf("setPeriod(%d)\n", mu);
        microseconds = mu;
    }
    void start()
    {
        SerialUSB1.printf("start\n");
        if (timer != nullptr && isrCallback != nullptr)
        {
            timer->begin(isrCallback, microseconds);
        }
        else
        {
            SerialUSB1.printf("start error");
        }
    }
    void stop()
    {
        SerialUSB1.printf("stop()\n");
        if (timer != nullptr)
        {
            SerialUSB1.printf("stop, not nullptr\n");
            timer->end();    //this will release the timer, I assume that the lib doesn't call start after stop without initializing.
            delete timer;
            timer = nullptr; // If so -> start would error. In case the lib wants to stop/start something more elaborate needs to be done here
        }
    }

    void attachInterrupt(void (*isr)())
    {
        SerialUSB1.printf("attachInterrupt()\n");
        isrCallback = isr;
    }

    void detachInterrupt()
    {
        SerialUSB1.println("detach");
        stop();
        isrCallback = nullptr;
    }

 private:
    uint32_t microseconds;
    IntervalTimer* timer = nullptr;
    void (*isrCallback)();
};

extern Timer TimerA;
extern Timer TimerB;
extern Timer TimerC;
extern Timer TimerD;

This is what it prints at startup
Code:
init 0x0
setPeriod(58)
attachInterrupt()
start
 
@luni
Seeing exactly the same thing on startup as you:
Code:
init 0x0
setPeriod(58)
attachInterrupt()
start
out of curiosity I switched the prog track to use pins 3/11 just in case. No difference
 
Yes, but does it print anything later while you are playing with the commands and getting the error? If so, the library might mess up some intervaltimer settings.
 
Yes, but does it print anything later while you are playing with the commands and getting the error? If so, the library might mess up some intervaltimer settings.

Nope nothing else gets printed as I am playing.

BTW will it work with a Teensy 3.x
Remember when I said it was working for a T3.5 well now its not working - argh. - no start up info from timer.h and no signal.
 
Weird. The library code seems to be quite elaborated. Would be surprised if it has such bugs in it. Maybe you changed something unintentionally while porting stuff?
 
Weird. The library code seems to be quite elaborated. Would be surprised if it has such bugs in it. Maybe you changed something unintentionally while porting stuff?

Funny it worked with the code I posted now must have did something to make it fail. Going to run a test just to see with time.h.

EDIT:
Something else must of broke. Just this test sketch and it works fine:
Code:
// Create an IntervalTimer object 
const int ledPin = LED_BUILTIN;  // the pin with a LED

#include "ArduinoTimers.h"
VirtualTimer * interruptTimer=NULL;

void setup() {
  pinMode(ledPin, OUTPUT);
  Serial.begin(9600);
  //myTimer.begin(blinkLED, 150000);  // blinkLED to run every 0.15 seconds

  interruptTimer= &TimerA; 
  interruptTimer->initialize();
  interruptTimer->setPeriod(58); // this is the 58uS DCC 1-bit waveform half-cycle
  interruptTimer->attachInterrupt(blinkLED);
  interruptTimer->start();
}

// The interrupt will blink the LED, and keep
// track of how many times it has blinked.
int ledState = LOW;
volatile unsigned long blinkCount = 0; // use volatile for shared variables

// functions called by IntervalTimer should be short, run as quickly as
// possible, and should avoid calling other functions if possible.
void blinkLED() {
  if (ledState == LOW) {
    ledState = HIGH;
    blinkCount = blinkCount + 1;  // increase when LED turns on
  } else {
    ledState = LOW;
  }
  digitalWrite(ledPin, ledState);
}

// The main program will print the blink count
// to the Arduino Serial Monitor
void loop() {
  unsigned long blinkCopy;  // holds a copy of the blinkCount

  // to read a variable which the interrupt code writes, we
  // must temporarily disable interrupts, to be sure it will
  // not change while we are reading.  To minimize the time
  // with interrupts off, just quickly make a copy, and then
  // use the copy while allowing the interrupt to keep working.
  noInterrupts();
  blinkCopy = blinkCount;
  interrupts();

  Serial.print("blinkCount = ");
  Serial.println(blinkCopy);
  delay(100);
}
 
Last edited:
@luni
Found the problem - the program for doesn't like going to USB1 when using a T3.5 and T3.6. Removed it and got everything started working as before. Did test with a T3.2 but it hanging somewhere so the DCC++ sketch doesn't work with the T3.2 - Not going to worry about that now though. Have to fix the problem with the T4.1 first.
 
Any luck?

Looks a bit complicated to jump in... I prefer easier things like MSC and MTP :D

Is it easy to setup and see failure when you don't have the hardware? Steps, and what to see when it fails versus OK?

I assume you have tried things like digitalWriteFast/digitalToggleFast around critical pieces of code to try to localize down issue?
 
Any luck?

Looks a bit complicated to jump in... I prefer easier things like MSC and MTP :D

Is it easy to setup and see failure when you don't have the hardware? Steps, and what to see when it fails versus OK?

I assume you have tried things like digitalWriteFast/digitalToggleFast around critical pieces of code to try to localize down issue?

Trying a whole host of things but no luck. Hate issues like this reminds me of UncannyEyes! Going to go back and play with MSC/MTP!
 
SOLVED:

Ok just to say thanks to everyone for helping and let you know that I finally tracked down the issue.

It appears that problem is using analogRead is not interrupt friendly.

The hang was in the DCCWaveform::checkPowerOverload specifically this line:
Code:
      lastCurrent = motorDriver->getCurrentRaw();

which gets called from loop in DCCWaveform. Disabling interrupts fixed the issue. So this is what it looks like now:
Code:
void DCCWaveform::loop() {
  noInterrupts();
  mainTrack.checkPowerOverload();
  progTrack.checkPowerOverload();
  interrupts();
}

Man that was a long drawn out process - thank goodness for the LA.
 
Glad you found the issue... :D
Wonder what is different between T3.x and T4.x?

Or does something in your interrupts also touch the Analog?
Sounds like maybe need test case to see and maybe the core code needs something to make it a little more robust!.

Man that was a long drawn out process - thank goodness for the LA.
Yep - Don't leave home without them :D
 
Morning @KurtE
Yeah me too!

within the interrupt it just calls off to analogRead - nothing fancy going on there. Was thinking about that myself especially since I have the test in post #19 as the base, but that will probably be later today - have some errands to run :)
 
Status
Not open for further replies.
Back
Top