TeensyCAN

Neat! I've got most of my stuff working now. Do you have any documents or explanations for what you mentioned above, where the nodes can discover each other and see if they are online and offline?

I had an interesting issue earlier. I was shorting the CAN bus out with pliers to test recovery from lost transmissions. Short-duration cuts were okay: as soon as I cleared the short, everything came back. But then I held a long-duration short, and when I removed it, the bus would not recover until I rebooted the receiving node. Then it came back.

Is that normal?
 
That's normal. With only 2 nodes on the bus, the nodes can't self-recover if both go into the error passive state, since there is no other traffic to acknowledge. Ideally the bus is not supposed to be shorted, especially for long periods. You can hack around this by forcing the ECR register to clear once the TX error count reaches 128, which is the point where the controller enters the error passive state.
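For reference, the states mentioned here come from the CAN fault-confinement rules, which depend on the controller's transmit and receive error counters. A minimal sketch of that classification in plain C++ follows. The names `CanErrorState` and `classify` are made up for illustration, and reading the real counters out of the ECR register on a Teensy is not shown.

```cpp
#include <cstdint>

// Fault-confinement states defined by the CAN specification.
enum class CanErrorState { ErrorActive, ErrorPassive, BusOff };

// Classify a controller from its transmit (TEC) and receive (REC) error
// counters, using the standard thresholds:
//   TEC > 255              -> bus off
//   TEC > 127 or REC > 127 -> error passive
//   otherwise              -> error active
// The counters are passed in as plain integers; how they are obtained
// from hardware is outside the scope of this sketch.
CanErrorState classify(uint16_t tec, uint16_t rec) {
    if (tec > 255) return CanErrorState::BusOff;
    if (tec > 127 || rec > 127) return CanErrorState::ErrorPassive;
    return CanErrorState::ErrorActive;
}
```

A transmitter that never gets an acknowledgement keeps raising its TEC, which is why a long short circuit can push a node out of the error active state.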
 
Interesting. That explains why it recovered even when I rebooted the teensy that was receiving and never transmitting. It must have sent something out of the bus to make the other one come back to life.
I'll keep that in mind with the ECR register. Is there a library command to clear it out?

Also, if I am reading the logic correctly, any traffic flowing on the bus should cause all the nodes to spring back to life if that were to happen, yes?
 
Yes, correct. For bus self-recovery to work, the controller must see valid frames on the bus, which brings the TX error count back down so the controller goes back online. There is no command for forced bus recovery, but it wouldn't be hard to implement in a single function.
 
I appear to be running into a new issue. I have added a third node and am stress testing the system to see how much traffic it can handle, and I seem to be crashing my master.

I can share source code if you think it's an issue with my coding. I'll admit it's very messy just while I learn.

The setup is as follows:

Node ID 1: Master, does not transmit anything, just receives. This is the one that crashes. Connects to can2.0 over can1 at 1000000 baud.

Node ID 2: Slave1, sends a 20 byte payload once per 1000ms, id 7.

Node ID 3: Slave2, sends a 12 byte payload once per 33ms, id 8 (stress test to replicate how fast I'd like some of the data to go for gauges and whatnot in real time)

System works perfectly for approximately 10-60 seconds, until in the serial monitor for the master it receives a larger packet than is thought to be possible, and then crashes the teensy.

Normal packet from node2:
Code:
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 10226 ID: 1FFD4082 Buffer: 0 0 0 14 BE 0 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 10374 ID: 1FFD8082 Buffer: 0 1 7 17 C8 70 2 4E 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 10517 ID: 1FFD8082 Buffer: 0 2 66 2E 42 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 10656 ID: 1FFD8082 Buffer: 0 3 41 EC 51 9A C4 64 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 10798 ID: 1FFDC082 Buffer: 0 4 20 F1 47 AA AA AA 
Node: 2	PacketID: 7	Broadcast: 0	Data: 23 200 112 2 78 102 46 66 148 135 69 65 236 81 154 196 100 32 241 71

Normal packet from node3:
Code:
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 55385 ID: 1FFD4083 Buffer: 0 0 0 C BE 11 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 55532 ID: 1FFD8083 Buffer: 0 1 8 BE C8 A4 2 B1 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 55673 ID: 1FFD8083 Buffer: 0 2 5A 91 43 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 55813 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 190 200 164 2 177 90 145 67 148 135 69 65

Here is my console right before the crash. Ignore the other stuff, that's just testing to see if data is actually updating in a format I can read as it screams past.

Code:
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 36653 ID: 1FFD4083 Buffer: 0 0 0 C BE 87 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 36799 ID: 1FFD8083 Buffer: 0 1 8 66 C8 E7 5 9B 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 36940 ID: 1FFD8083 Buffer: 0 2 E0 C 44 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 37081 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 102 200 231 5 155 224 12 68 148 135 69 65 
NODE 2 A: 81
NODE 2 B: 100.60
NODE 2 ELAPSED: 1459
NODE 3 A: 102
NODE 3 B: 563.51
NODE 3 ELAPSED: 1511
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5117 ID: 1FFD4083 Buffer: 0 0 0 C BE 1B FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5264 ID: 1FFD8083 Buffer: 0 1 8 67 C8 E7 5 1 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5406 ID: 1FFD8083 Buffer: 0 2 E7 C 44 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5546 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 103 200 231 5 1 231 12 68 148 135 69 65 
NODE 2 A: 81
NODE 2 B: 100.60
NODE 2 ELAPSED: 1459
NODE 3 A: 103
NODE 3 B: 563.61
NODE 3 ELAPSED: 1511
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39116 ID: 1FFD4083 Buffer: 0 0 0 C BE 78 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39263 ID: 1FFD8083 Buffer: 0 1 8 68 C8 E7 5 67 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39403 ID: 1FFD4082 Buffer: 0 0 0 14 BE C4 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39550 ID: 1FFD8083 Buffer: 0 2 ED C 44 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39688 ID: 1FFD8082 Buffer: 0 1 7 52 C8 B4 5 E8 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39828 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39967 ID: 1FFD8082 Buffer: 0 2 65 C9 42 94 87 45 
Node: 3	PacketID: 8	Broadcast: 0	Data: 104 200 231 5 103 237 12 68 148 135 69 65 
NODE 2 A: 81
NODE 2 B: 100.60
NODE 2 ELAPSED: 1459
NODE 3 A: 104
NODE 3 B: 563.71
NODE 3 ELAPSED: 1511
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 40250 ID: 1FFD8082 Buffer: 0 3 41 EC 51 9A C4 64

It seems like normally, node 2 sends 5 frames at once, node 3 sends 4 frames at once. Then it receives something it perceives as one giant group of many more, and locks up. I guess in some tricky situations where the timing is juuuuuuust right, the TeensyCAN library can lose track of which frames belong to which packet and just combines everything it hears into one giant packet? That's what I'm understanding from the console dump above.

Is there anything I can do to stop this from happening? I feel like this quirk of timing can happen to any group of nodes, it just happens very quickly when I have one of them transmitting very quickly.

At first I thought that it was crashing because my code was receiving a larger than expected packet and it may have been overflowing, but this is the code I use to copy the buffer into the structure. I will be changing it to a state machine eventually, but for now this is my quick and dirty test:

Code:
if(info.node == 2 && info.packetid == 7) {  //if node is 2 and canbus packet id is 7, copy buffer to structure
    for(int i = 0; i < sizeof(testPacket_t); i++) {  //step through total size of buffer to write in data
      datatest.testPacket[i] = buffer[i]; //copy one by one
    }
  }

  if(info.node == 3 && info.packetid == 8) {  //if node is 3 and canbus packet id is 8, copy buffer to structure
    for(int i = 0; i < sizeof(testPacket2_t); i++) {  //step through total size of buffer to write in data
      datatest2.testPacket2[i] = buffer[i]; //copy one by one
    }
  }

As you can see above, at first I thought when the system gets a larger than expected packet, it would overflow the array and crash. But I realize that I'm setting my for loop count using the size of the specified structure, not the received data. So it should be impossible to overflow the byte array and only possible to simply just corrupt numbers. Unless writing a float incorrectly can also crash the teensy?
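One way to rule the copy loop in or out is to refuse any payload whose length does not match the struct exactly, since the loop reads `sizeof(...)` bytes from `buffer` no matter how many bytes actually arrived. A minimal sketch in plain C++, assuming the 12-byte layout of `testData2_t` from the sketches. `copyIfExactSize` is a hypothetical helper, not part of TeensyCAN.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Same field layout as testData2_t in the sketches (assumed to pack to 12 bytes).
struct TestData2 {
    uint8_t  a;
    uint8_t  b;
    uint16_t c;
    float    d;
    float    e;
};
static_assert(sizeof(TestData2) == 12, "layout assumption for this sketch");

// Copy a received payload into `out` only when its length matches exactly.
// Returns false (and leaves `out` untouched) for short or oversized payloads.
template <typename T>
bool copyIfExactSize(const uint8_t* buffer, size_t length, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only plain-data structs can be filled with memcpy");
    if (buffer == nullptr || length != sizeof(T)) {
        return false;
    }
    std::memcpy(&out, buffer, sizeof(T));
    return true;
}
```

In the callback this would be called as `copyIfExactSize(buffer, length, datatest2.test)`, and a `false` return would flag the oversized transfer without touching the struct.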

We are at the very edge of my understanding here haha..

I calculated that the worst case packet with bit stuffing and gaps for 6 transmission clocks results in me being able to send around 6200 full sized bit stuffed packets per second at 1000000 baud. This is nowhere near that, at what I believe is only around 121-122 packets per second, considering that each node 2 transmission is 5 canbus frames @ 1/sec and each node 3 transmission is 4 canbus frames @ 30/sec.
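This estimate can be reproduced with a few lines of arithmetic. The sketch below assumes CAN 2.0B extended frames with 8 data bytes (as in the dumps, `EXT: 1`, `LEN: 8`), worst-case bit stuffing, and the standard 3-bit interframe space. The function names are illustrative.

```cpp
#include <cstdint>

// Worst-case length in bits of one extended (29-bit ID) CAN 2.0 data frame
// carrying `dataBytes` bytes (0..8), including stuff bits and interframe space.
//
// Stuffable fields: SOF(1) + base ID(11) + SRR(1) + IDE(1) + ext ID(18)
//                   + RTR(1) + reserved(2) + DLC(4) + CRC(15) = 54, plus data.
// Worst case adds one stuff bit per 4 bits after the first stuffable bit.
// Unstuffed tail:   CRC delimiter(1) + ACK slot(1) + ACK delimiter(1) + EOF(7) = 10.
// Interframe space: 3 bits.
constexpr uint32_t worstCaseExtendedFrameBits(uint32_t dataBytes) {
    return (54 + 8 * dataBytes) + (54 + 8 * dataBytes - 1) / 4 + 10 + 3;
}

// Upper bound on back-to-back frames per second at a given bit rate.
constexpr uint32_t maxFramesPerSecond(uint32_t bitRate, uint32_t dataBytes) {
    return bitRate / worstCaseExtendedFrameBits(dataBytes);
}

// Fraction of the bus occupied by `framesPerSecond` worst-case frames.
double busLoadFraction(double framesPerSecond, uint32_t dataBytes, double bitRate) {
    return framesPerSecond * worstCaseExtendedFrameBits(dataBytes) / bitRate;
}
```

With 8 data bytes this gives 160 bits per frame, or 6250 frames per second at 1 Mbit/s, close to the figure in the post. Taking 5 frames once per second plus 4 frames about 30 times per second (roughly 125 frames per second), the worst-case bus load comes out at about 2%.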

Let me know if you have seen anything like this before. While I was typing this I set the node 3 speed down to just 5 per second and it did admittedly take a lot longer to crash but it did crash still, for the same reason, with a huge packet of data in the bottom.

Something else to make note of is that when I relaunch the master, both slaves start transmitting again happily, they survive this just fine.

Another observation is one node sending a 5 frame payload (20 bytes) at 100 sends per second (500 can frames per second) was left running for 16 hours and it didn't miss a single packet or crash at all. Only when I introduced a third node to the system has the severe instability started.

I will attach all 3 sketches in a following post.
 
Master (receiver, node 1)

Code:
//THIS IS NODE 1

#include <FlexCAN_T4.h>
#include <TeensyCAN.h>

bool debugMode = true;  //debug mode for extended sniffer data
bool heartBeat = false; //heartbeat led

//flexCan and TeensyCan config stuff
FlexCAN_T4<CAN1, RX_SIZE_256, TX_SIZE_16> Can0;
//data for other nodes
TeensyCAN node2 = TeensyCAN(2);  //other node

//testdata
uint8_t data[1024] = {
  1, 2, 3, 4, 5, 6, 7, 8, 9, 10
};
uint8_t data2[1024] = {
  11, 12, 13, 14, 15, 16, 17, 18, 19, 20
};

//fucking around with structures and unions
typedef struct testData_t{
  byte a;
  byte b;
  uint16_t c;
  float d;
  float e;
  float f;
  float g;
};

typedef union testPacket_t{
  testData_t test;
  byte testPacket[sizeof(testData_t)];
};

testPacket_t datatest;

typedef struct testData2_t{  //second packet set
  byte a;
  byte b;
  uint16_t c;
  float d;
  float e;
};

typedef union testPacket2_t{
  testData2_t test;
  byte testPacket2[sizeof(testData2_t)];
};

testPacket2_t datatest2;
//***************************


void setup() {
  Serial.begin(115200); delay(400);  //obvious
  Can0.begin(); //begin canbus
  Node.setID(1); //set THIS NODES ID
  Node.setBus(_CAN1); //set to canbus 1

  pinMode(13,OUTPUT);  //LED pin output

  Node.onReceive(cb);  //callback for teensyCAN library stuff
  Can0.setBaudRate(1000000);  //baud rate
  Can0.setMaxMB(16);  //max mailboxes??
  Can0.enableFIFO();  //enable fifo (important?)
  Can0.enableFIFOInterrupt(); //probably also important, was in the example
  Can0.onReceive(canSniff);  //callback for RAW canbus stuff
  Can0.enableMBInterrupts();  //also in example?
  Can0.mailboxStatus();       //also in example?

}

void loop() {
  Can0.events();
  Node.events();
  heartBeat = true;
  if(datatest.test.c % 2 == 0) {  //if time in seconds is even, flip it
    heartBeat = false;
  }
  digitalWrite(13,heartBeat); //write to led
}

void cb(const uint8_t* buffer, uint16_t length, AsyncTC info) {
  Serial.print("Node: ");
  Serial.print(info.node);
  Serial.print("\tPacketID: ");
  Serial.print(info.packetid);
  Serial.print("\tBroadcast: ");
  Serial.print(info.broadcast);
  Serial.print("\tData: ");
  for ( uint8_t i = 0; i < length; i++ ) {
    ::Serial.print(buffer[i]);
    ::Serial.print(" ");
  }::Serial.println();

  if(info.node == 2 && info.packetid == 7) {  //if node is 2 and canbus packet id is 7, copy buffer to structure
    for(int i = 0; i < sizeof(testPacket_t); i++) {  //step through total size of buffer to write in data
      datatest.testPacket[i] = buffer[i]; //copy one by one
    }
  }

  if(info.node == 3 && info.packetid == 8) {  //if node is 3 and canbus packet id is 8, copy buffer to structure
    for(int i = 0; i < sizeof(testPacket2_t); i++) {  //step through total size of buffer to write in data
      datatest2.testPacket2[i] = buffer[i]; //copy one by one
    }
  }

  Serial.print("NODE 2 A: "); Serial.println(datatest.test.a);  //test!
  Serial.print("NODE 2 B: "); Serial.println(datatest.test.d);  //test!
  Serial.print("NODE 2 ELAPSED: "); Serial.println(datatest.test.c);  //elapsed time in seconds

  Serial.print("NODE 3 A: "); Serial.println(datatest2.test.a);  //test!
  Serial.print("NODE 3 B: "); Serial.println(datatest2.test.d);  //test!
  Serial.print("NODE 3 ELAPSED: "); Serial.println(datatest2.test.c);  //elapsed time in seconds
  
}

void canSniff(const CAN_message_t &msg) {
  if(debugMode == 1) {
    Serial.print("ISR - MB "); Serial.print(msg.mb);
    Serial.print("  OVERRUN: "); Serial.print(msg.flags.overrun);
    Serial.print("  LEN: "); Serial.print(msg.len);
    Serial.print(" EXT: "); Serial.print(msg.flags.extended);
    Serial.print(" TS: "); Serial.print(msg.timestamp);
    Serial.print(" ID: "); Serial.print(msg.id, HEX);
    Serial.print(" Buffer: ");
    for ( uint8_t i = 0; i < msg.len; i++ ) {
      Serial.print(msg.buf[i], HEX); Serial.print(" ");
    } Serial.println();
  }
}

Slave1 (slow transmitter, node 2)

Code:
//THIS IS NODE 2

#include <FlexCAN_T4.h>
#include <TeensyCAN.h>

bool debugMode = false;  //debug mode for extended sniffer data
bool heartBeat = false;  //heartbeat flag
static uint32_t t = millis();
uint8_t x = 0;
float y = 0;
byte floatArray[4];


//fucking around with structures and unions
typedef struct testData_t{
  byte a;
  byte b;
  uint16_t c;
  float d;
  float e;
  float f;
  float g;
};

typedef union testPacket_t{
  testData_t test;
  byte testPacket[sizeof(testData_t)];
};

testPacket_t datatest;
//***************************
  
//flexCan and TeensyCan config stuff
FlexCAN_T4<CAN1, RX_SIZE_256, TX_SIZE_16> Can0;
//data for other nodes
TeensyCAN node1 = TeensyCAN(1);  //other nodes
TeensyCAN node8 = TeensyCAN(8); //invalid node on purpose


//testdata
uint8_t data[16] = {
  1, 2, 3, 4, 5, 6, 7, 8, 9, 10
};
uint8_t data2[16] = {
  11, 12, 13, 14, 15, 16, 17, 18, 19, 20
};

void setup() {
  Serial.begin(115200); delay(400);  //obvious
  pinMode(13,OUTPUT);  //led pinmode
  Can0.begin(); //begin canbus
  Node.setID(2); //set THIS NODES ID
  Node.setBus(_CAN1); //set to canbus 1

  Node.onReceive(cb);  //callback for teensyCAN library stuff
  Can0.setBaudRate(1000000);  //baud rate
  Can0.setMaxMB(16);  //max mailboxes??
  Can0.enableFIFO();  //enable fifo (important?)
  Can0.enableFIFOInterrupt(); //probably also important, was in the example
  Can0.onReceive(canSniff);  //callback for RAW canbus stuff
  Can0.enableMBInterrupts();  //also in example?
  Can0.mailboxStatus();       //also in example?

  datatest.test.a = 100;  //set some test variables
  datatest.test.b = 200;
  datatest.test.c = 12345;
  datatest.test.d = 0.1;
  datatest.test.e = 12.3456;
  datatest.test.f = -1234.56;
  datatest.test.g = 123456.78;
}

void loop() {
  Can0.events();
  Node.events();
  
  heartBeat = true;
  if(datatest.test.c % 2 == 0) {  //if time in seconds is even, flip it
    heartBeat = false;
  }
  digitalWrite(13,heartBeat); //write to led

  if ( millis() - t > 1000 ) {
/*
  data[0] = x;
  Serial.print("NODE1 ACK: "); Serial.println(node1.sendMsg(data, 10, 7));  //send ten bytes to node 1, with a canbus ID of 7
//  Serial.print("NODE8 ACK: "); Serial.println(node8.sendMsg(data2, 10, 99));  //send ten bytes to node 8, with a canbus ID of 99
  Serial.print("GLOBAL ACK: "); Serial.println(Node.sendMsg(data2, 10, 99));  //broadcasts 10 bytes wide open, with a canbus ID of 99
  x++;
  if(x > 7) {
    x = 0;
  }
  */

  
  
  datatest.test.a++;  //increment test.a
  datatest.test.d = datatest.test.d + 0.1;  //increment test.d
  datatest.test.c = millis() / 1000;  //divide millis by 1000 for elapsed seconds since boot

  Serial.print("NODE1 ACK: "); Serial.println(node1.sendMsg(datatest.testPacket, sizeof(testData_t), 7));  //send 20? byte array to node 1, with a canbus ID of 7
/*  
  Serial.print("DATA SENT: ");
  for(int i = 0; i < sizeof(testData_t); i++) {
    Serial.print(datatest.testPacket[i]);
    Serial.print(" ");
  }
  Serial.println();
  */

  Serial.print("FLOAT VAR: "); Serial.println(datatest.test.d);
  Serial.print("SECONDS RUNNING: "); Serial.println(datatest.test.c);  //elapsed time
  
  t = millis();
  }
  
}


void cb(const uint8_t* buffer, uint16_t length, AsyncTC info) {
  Serial.print("Node: ");
  Serial.print(info.node);
  Serial.print("\tPacketID: ");
  Serial.print(info.packetid);
  Serial.print("\tBroadcast: ");
  Serial.print(info.broadcast);
  Serial.print("\tData: ");
  for ( uint8_t i = 0; i < length; i++ ) {
    ::Serial.print(buffer[i]);
    ::Serial.print(" ");
  }::Serial.println();
}

void canSniff(const CAN_message_t &msg) {
  if(debugMode == true) {
    Serial.print("ISR - MB "); Serial.print(msg.mb);
    Serial.print("  OVERRUN: "); Serial.print(msg.flags.overrun);
    Serial.print("  LEN: "); Serial.print(msg.len);
    Serial.print(" EXT: "); Serial.print(msg.flags.extended);
    Serial.print(" TS: "); Serial.print(msg.timestamp);
    Serial.print(" ID: "); Serial.print(msg.id, HEX);
    Serial.print(" Buffer: ");
    for ( uint8_t i = 0; i < msg.len; i++ ) {
      Serial.print(msg.buf[i], HEX); Serial.print(" ");
    } Serial.println();
  }
}

Slave 2 (fast transmitter, node 3)

Code:
//THIS IS NODE 3

#include <FlexCAN_T4.h>
#include <TeensyCAN.h>

bool debugMode = false;  //debug mode for extended sniffer data
bool heartBeat = false;  //heartbeat flag
static uint32_t t = millis();
uint8_t x = 0;
float y = 0;
byte floatArray[4];


//fucking around with structures and unions
typedef struct testData_t{
  byte a;
  byte b;
  uint16_t c;
  float d;
  float e;
};

typedef union testPacket_t{
  testData_t test;
  byte testPacket[sizeof(testData_t)];
};

testPacket_t datatest;
//***************************
  
//flexCan and TeensyCan config stuff
FlexCAN_T4<CAN1, RX_SIZE_256, TX_SIZE_16> Can0;
//data for other nodes
TeensyCAN node1 = TeensyCAN(1);  //other nodes
TeensyCAN node8 = TeensyCAN(8); //invalid node on purpose


//testdata
uint8_t data[16] = {
  1, 2, 3, 4, 5, 6, 7, 8, 9, 10
};
uint8_t data2[16] = {
  11, 12, 13, 14, 15, 16, 17, 18, 19, 20
};

void setup() {
  Serial.begin(115200); delay(400);  //obvious
  pinMode(13,OUTPUT);  //led pinmode
  Can0.begin(); //begin canbus
  Node.setID(3); //set THIS NODES ID
  Node.setBus(_CAN1); //set to canbus 1

  Node.onReceive(cb);  //callback for teensyCAN library stuff
  Can0.setBaudRate(1000000);  //baud rate
  Can0.setMaxMB(16);  //max mailboxes??
  Can0.enableFIFO();  //enable fifo (important?)
  Can0.enableFIFOInterrupt(); //probably also important, was in the example
  Can0.onReceive(canSniff);  //callback for RAW canbus stuff
  Can0.enableMBInterrupts();  //also in example?
  Can0.mailboxStatus();       //also in example?

  datatest.test.a = 100;  //set some test variables
  datatest.test.b = 200;
  datatest.test.c = 12345;
  datatest.test.d = 0.1;
  datatest.test.e = 12.3456;
}

void loop() {
  Can0.events();
  Node.events();
  
  heartBeat = true;
  if(datatest.test.c % 2 == 0) {  //if time in seconds is even, flip it
    heartBeat = false;
  }
  digitalWrite(13,heartBeat); //write to led

  if ( millis() - t > 33 ) {
  
  datatest.test.a++;  //increment test.a
  datatest.test.d = datatest.test.d + 0.1;  //increment test.d
  datatest.test.c = millis() / 1000;  //divide millis by 1000 for elapsed seconds since boot

  Serial.print("NODE1 ACK: "); Serial.println(node1.sendMsg(datatest.testPacket, sizeof(testData_t), 8));  //send 12? byte array to node 1, with a canbus ID of 8
/*  
  Serial.print("DATA SENT: ");
  for(int i = 0; i < sizeof(testData_t); i++) {
    Serial.print(datatest.testPacket[i]);
    Serial.print(" ");
  }
  Serial.println();
  */

  Serial.print("FLOAT VAR: "); Serial.println(datatest.test.d);
  Serial.print("SECONDS RUNNING: "); Serial.println(datatest.test.c);  //elapsed time
  
  t = millis();
  }
  
}


void cb(const uint8_t* buffer, uint16_t length, AsyncTC info) {
  Serial.print("Node: ");
  Serial.print(info.node);
  Serial.print("\tPacketID: ");
  Serial.print(info.packetid);
  Serial.print("\tBroadcast: ");
  Serial.print(info.broadcast);
  Serial.print("\tData: ");
  for ( uint8_t i = 0; i < length; i++ ) {
    ::Serial.print(buffer[i]);
    ::Serial.print(" ");
  }::Serial.println();
}

void canSniff(const CAN_message_t &msg) {
  if(debugMode == true) {
    Serial.print("ISR - MB "); Serial.print(msg.mb);
    Serial.print("  OVERRUN: "); Serial.print(msg.flags.overrun);
    Serial.print("  LEN: "); Serial.print(msg.len);
    Serial.print(" EXT: "); Serial.print(msg.flags.extended);
    Serial.print(" TS: "); Serial.print(msg.timestamp);
    Serial.print(" ID: "); Serial.print(msg.id, HEX);
    Serial.print(" Buffer: ");
    for ( uint8_t i = 0; i < msg.len; i++ ) {
      Serial.print(msg.buf[i], HEX); Serial.print(" ");
    } Serial.println();
  }
}
 
Check the constructor TX_SIZE and keep it a power of 2. It's gotta be big enough to support your array size in 8 byte chunks: 8 x 16 = 128 bytes max. Have you tried normal one dimensional arrays rather than structs?
 
Okay, I'll increase it to 128 or 256 on all three and see what happens. Why would the TX affect the master receiving, anyway? I'm not 100% sure how that works.
 
I upped the TX_SIZE on all 3 sketches to 256 and it lasted a bit longer, but eventually it died all the same. Here is my console output on my master before it froze:

Code:
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 55007 ID: 1FFC0003 Buffer: 3 8 10 0 B B8 0 0 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 473 ID: 1FFD4083 Buffer: 0 0 0 C BE BF FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 620 ID: 1FFD8083 Buffer: 0 1 8 FA C8 42 0 7B 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 763 ID: 1FFD8083 Buffer: 0 2 4D 42 43 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 903 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 250 200 66 0 123 77 66 67 148 135 69 65 
NODE 2 A: 184
NODE 2 B: 8.50
NODE 2 ELAPSED: 84
NODE 3 A: 250
NODE 3 B: 194.30
NODE 3 ELAPSED: 66
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 34472 ID: 1FFD4083 Buffer: 0 0 0 C BE FA FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 34620 ID: 1FFD8083 Buffer: 0 1 8 FB C8 42 0 15 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 34762 ID: 1FFD8083 Buffer: 0 2 67 42 43 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 34901 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 251 200 66 0 21 103 66 67 148 135 69 65 
NODE 2 A: 184
NODE 2 B: 8.50
NODE 2 ELAPSED: 84
NODE 3 A: 251
NODE 3 B: 194.40
NODE 3 ELAPSED: 66
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 2936 ID: 1FFD4083 Buffer: 0 0 0 C BE A0 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 3084 ID: 1FFD8083 Buffer: 0 1 8 FC C8 42 0 AF 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 3226 ID: 1FFD8083 Buffer: 0 2 80 42 43 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 3366 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 252 200 66 0 175 128 66 67 148 135 69 65 
NODE 2 A: 184
NODE 2 B: 8.50
NODE 2 ELAPSED: 84
NODE 3 A: 252
NODE 3 B: 194.50
NODE 3 ELAPSED: 66
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 36935 ID: 1FFD4083 Buffer: 0 0 0 C BE 5D FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 37083 ID: 1FFD8083 Buffer: 0 1 8 FD C8 42 0 49 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 37226 ID: 1FFD8083 Buffer: 0 2 9A 42 43 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 37365 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 253 200 66 0 73 154 66 67 148 135 69 65 
NODE 2 A: 184
NODE 2 B: 8.50
NODE 2 ELAPSED: 84
NODE 3 A: 253
NODE 3 B: 194.60
NODE 3 ELAPSED: 66
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5356 ID: 1FFD4082 Buffer: 0 0 0 14 BE 8A FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5501 ID: 1FFD4083 Buffer: 0 0 0 C BE DD FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5646 ID: 1FFD8082 Buffer: 0 1 7 B9 C8 55 0 96 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5787 ID: 1FFD8083 Buffer: 0 1 8 FE C8 42 0 E3 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5928 ID: 1FFD8082 Buffer: 0 2 99 9 41 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 6067 ID: 1FFD8083 Buffer: 0 2 B3 42 43 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 6205 ID: 1FFD8082 Buffer: 0 3 41 EC 51 9A C4 64 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 6346 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA

I think I can see that every time node 3 sends stuff, the last 3 digits of the ID are 083, and when node 2 sends stuff, it's 082. Then when it crashes, I can see 083 and 082 mixed together, which I guess means they managed to sneak their packets between each other? I'm not sure how that's possible, given that node 2 is sending with an ID of 7 and node 3 is sending with an ID of 8, which means 7 should always win..
 
Not sure. skpang used it to transfer images (pictures) over CAN to display on a remote LCD, so with that being a much larger number of bytes, I am thinking it's a code issue as well :p
 
Huh the forum appears to have posted some stuff a bit out of order.. I'm breaking everything today :D

I will switch both slave nodes to broadcast and see what happens.
 
Both slaves are now spamming a lot of received packet data on their consoles but seem to stay running quite happily. The master still crashes after a dozen or so seconds and displays a huge impossibly sized packet before doing so.

Code:
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 2868 ID: 1FFD4003 Buffer: 0 0 0 C BE 54 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 3015 ID: 1FFD8003 Buffer: 0 1 8 2 C8 E 0 F0 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 3159 ID: 1FFD8003 Buffer: 0 2 FF 25 42 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 3299 ID: 1FFDC003 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 1	Data: 2 200 14 0 240 255 37 66 148 135 69 65 
NODE 2 A: 102
NODE 2 B: 0.30
NODE 2 ELAPSED: 2
NODE 3 A: 2
NODE 3 B: 41.50
NODE 3 ELAPSED: 14
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 36867 ID: 1FFD4003 Buffer: 0 0 0 C BE 69 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 37013 ID: 1FFD8003 Buffer: 0 1 8 3 C8 E 0 56 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 37156 ID: 1FFD8003 Buffer: 0 2 66 26 42 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 37295 ID: 1FFDC003 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 1	Data: 3 200 14 0 86 102 38 66 148 135 69 65 
NODE 2 A: 102
NODE 2 B: 0.30
NODE 2 ELAPSED: 2
NODE 3 A: 3
NODE 3 B: 41.60
NODE 3 ELAPSED: 14
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5331 ID: 1FFD4003 Buffer: 0 0 0 C BE 2E FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5478 ID: 1FFD8003 Buffer: 0 1 8 4 C8 E 0 BC 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5622 ID: 1FFD8003 Buffer: 0 2 CC 26 42 94 87 45 
Node: 3	PacketID: 8	Broadcast: 1	Data: 4 200 14 0 188 204 38 66 148 135 69 65 
NODE 2 A: 102
NODE 2 B: 0.30
NODE 2 ELAPSED: 2
NODE 3 A: 4
NODE 3 B: 41.70
NODE 3 ELAPSED: 14
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 5761 ID: 1FFDC003 Buffer: 0 3 41 AA AA AA AA AA 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39330 ID: 1FFD4003 Buffer: 0 0 0 C BE 4F FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39476 ID: 1FFD8003 Buffer: 0 1 8 5 C8 E 0 22 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39619 ID: 1FFD8003 Buffer: 0 2 33 27 42 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 39759 ID: 1FFDC003 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 1	Data: 5 200 14 0 34 51 39 66 148 135 69 65 
NODE 2 A: 102
NODE 2 B: 0.30
NODE 2 ELAPSED: 2
NODE 3 A: 5
NODE 3 B: 41.80
NODE 3 ELAPSED: 14
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 7793 ID: 1FFD4003 Buffer: 0 0 0 C BE 4C FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 7939 ID: 1FFD4002 Buffer: 0 0 0 14 BE B6 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 8084 ID: 1FFD8003 Buffer: 0 1 8 6 C8 E 0 88 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 8225 ID: 1FFD8002 Buffer: 0 1 7 67 C8 3 0 CD 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 8367 ID: 1FFD8003 Buffer: 0 2 99 27 42 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 8505 ID: 1FFD8002 Buffer: 0 2 CC CC 3E 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 8645 ID: 1FFDC003 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 1	Data: 6 200 14 0 136 153 39 66 148 135 69 65 
NODE 2 A: 102
NODE 2 B: 0.30
NODE 2 ELAPSED: 2
NODE 3 A: 6
NODE 3 B: 41.90
NODE 3 ELAPSED: 14
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 8784 ID: 1FFD8002 Buffer: 0 3 41 EC 51 9A C4 64
 
Maybe there's an issue with the displacement of 2 transfers. Can you try with only one node to see if it still crashes the master?
 
Oh yeah that's not a problem. As I said in an earlier post, before I added node 3, I just had node 2 sending 20 byte payloads (5 can bus frames) 100 times per second. I did it for 16 hours by leaving it on overnight. In the morning, everything was still working perfectly, it had not dropped a single packet. (I had a counter going on the master and the slave and after 16 hours, 36 minutes or so, and 5,978,534 packets, they both lined up, not a single drop.)
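The same kind of check can also be done per packet on the receiver alone, using a counter carried in the payload (the `a` byte in the sketches is incremented on every send). A minimal sketch in plain C++. `DropCounter` is a hypothetical helper and assumes fewer than 256 packets are lost in a row and that packets are never duplicated.

```cpp
#include <cstdint>

// Tracks a wrapping 8-bit sequence counter and counts missed packets.
class DropCounter {
public:
    // Feed the counter byte from each received packet.
    // Returns how many packets were skipped just before this one.
    uint8_t onPacket(uint8_t seq) {
        uint8_t missed = 0;
        if (haveLast_) {
            // Truncating to 8 bits makes the 255 -> 0 wrap count as a step of 1.
            missed = static_cast<uint8_t>(seq - last_ - 1);
        }
        last_ = seq;
        haveLast_ = true;
        totalMissed_ += missed;
        return missed;
    }

    // Total packets missed since the first packet was seen.
    uint32_t totalMissed() const { return totalMissed_; }

private:
    uint8_t  last_ = 0;
    bool     haveLast_ = false;
    uint32_t totalMissed_ = 0;
};
```

One instance per sending node would be needed, since each node runs its own counter.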

It seems like something bad is happening when there are two nodes that happen to try to transmit at the same time. Which is a shame, because I'd be a bit bummed to have to gum up the works by having to make sure the nodes maintain some kind of coordination and don't try to step on each other. After all, this is the whole reason I wanted to use can bus.

The only thing setting everything to broadcast has done so far is make the timeout lag / stuttering on the slaves when the master dies no longer a factor. Which is nice, actually.
 
Yeah, once they drop off the active list, transfers to them are blocked. The timeouts made me work on that, to prevent delays caused by missing nodes by making sure they're on the list before sending.
 
That explains why they recover to their usual speed on the heartbeat led after a little while.

So all in all, do you think this is an error in my programming, or a potential bug in the teensyCAN library?

Thank you for answering all my questions so quickly, by the way!
 
It's probably an overlap between nodes that would need to be fixed on the reassembly side (receiver). If it locked up, that doesn't tell me whether it was an array overlap or a memmove transfer injected into the wrong array. It's been a while since I made that library.

It wouldn't be a memmove issue, since the data is properly injected into the array at the indices the DLC counts. However, it may be something weird in the way it pulls the queued incomplete array and memmove shifts the dataset in. It then has to put it back in the queue, unless I modified the queue slot directly (it is possible to edit queues directly without popping them using the circular array buffer library). I just don't remember how I did it, it's been a while. Eventually I'll look into it. Plus there are the interrupts (which are required for listening while sending), so there may be a conflict when a queue is complete and another one fires over the process as well.

Did you try flooding the network with one node and one master? Can you make it crash by flooding? If it can handle flooding, then for sure it's a problem with the way TeensyCAN handles multiple queues.
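To illustrate what reassembly that tolerates interleaved senders could look like, here is a minimal sketch in plain C++. It is not TeensyCAN's implementation or wire format: the `Frame` struct, `FrameKind`, and `Reassembler` are hypothetical, and the only idea taken from the thread is keeping one partial buffer per sending node so that frames from node 2 and node 3 can alternate without being merged.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

enum class FrameKind { First, Middle, Last };

// Hypothetical, simplified frame. This is NOT TeensyCAN's wire format.
struct Frame {
    uint8_t source;                // sending node
    FrameKind kind;                // position within the transfer
    uint16_t totalLength;          // declared transfer size, used on First only
    std::vector<uint8_t> payload;  // data bytes (ignored on First)
};

class Reassembler {
public:
    // Returns true and fills `out` when `f` completes a transfer.
    bool feed(const Frame& f, std::vector<uint8_t>& out) {
        if (f.kind == FrameKind::First) {
            Partial& p = partial_[f.source];  // start or restart this sender's transfer
            p.expected = f.totalLength;
            p.data.clear();
            return false;
        }
        auto it = partial_.find(f.source);
        if (it == partial_.end()) return false;  // stray frame with no First: ignore
        Partial& p = it->second;
        // Never grow past the declared length; drop the transfer instead.
        if (p.data.size() + f.payload.size() > p.expected) {
            partial_.erase(it);
            return false;
        }
        p.data.insert(p.data.end(), f.payload.begin(), f.payload.end());
        if (f.kind != FrameKind::Last) return false;
        const bool complete = (p.data.size() == p.expected);
        if (complete) out = p.data;
        partial_.erase(it);
        return complete;
    }

private:
    struct Partial {
        size_t expected = 0;
        std::vector<uint8_t> data;
    };
    std::map<uint8_t, Partial> partial_;  // one in-progress transfer per sender
};
```

Feeding it frames in the alternating order seen in the dumps (first frame from one node, first frame from the other, and so on) yields two separate transfers, and a transfer that would grow past its declared length is dropped instead of overrunning a buffer.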
 
Ok! As I want to avoid having a central node be in charge of scheduling, I will switch away from this library and use flexcan directly and break my data up into more, smaller chunks to avoid needing to send large transfers at the same time.

Thank you for the help! I will keep my eye open in case any fixes get made.

On that note though, is there anything I can run on my receiver node that can determine for you what the lock up is?
 
If the lockup is from the callback or loop(), serial prints usually find the last line reached before it locked up. If it's from an ISR, then prints won't work unless you put them in the library.

If you are only doing 12 bytes, you can do CAN FD without any library, as it can carry up to 64 data bytes per frame. Teensy 4.1 has the pins broken out.
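For sizing, CAN FD payload lengths above 8 bytes are only available in fixed steps selected by the 4-bit DLC field. A small sketch of that mapping in plain C++ (the helper name `smallestFdDlcFor` is made up here):

```cpp
#include <cstdint>

// Payload size in bytes selected by each CAN FD DLC value (0..15).
static const uint8_t kFdDlcToLength[16] = {
    0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 16, 20, 24, 32, 48, 64
};

// Smallest DLC whose payload size can hold `bytes`.
// Returns -1 when `bytes` exceeds the 64-byte CAN FD maximum.
int smallestFdDlcFor(unsigned bytes) {
    for (int dlc = 0; dlc < 16; ++dlc) {
        if (kFdDlcToLength[dlc] >= bytes) {
            return dlc;
        }
    }
    return -1;
}
```

A 12-byte payload fits exactly (DLC 9), and the 20-byte payload from node 2 fits exactly as well (DLC 11). A 13-byte payload would need the 16-byte size (DLC 10).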

To me the lockup is somewhere at the reassembly stage. If you look at the data bytes, 0, 1, 2, 3 are the sectioned data of your messages. Try this: send an array of 2 bytes in your 3 node setup (not 20), with 2 nodes sending 2 bytes. If they can fit in a single message they shouldn't crash at all. This will show if it's a queue management issue.
 
I just tried putting a repeating Serial.println in the main loop; it locks up too when a larger transfer comes in.

12 bytes is just my testing, I imagine the maximum will likely be much much larger than that. When one of the computers tries to transmit a large table of waypoint lat/lon pairs, for example.

Also, yeah, unfortunately I already bought 8 teensy 4.0's, so I'm kind of stuck with them for now, haha.

If by flooding you mean just sending as fast as possible, I have not tried that but I will try that now. I'll just remove the delay entirely to force the single slave transmitter to maximum speed.
 
I am currently uncapped sending data from a single node at around 999 transfers of 12 bytes (4 can frames) per second. It's been running for some time now and seems to be working well.

I will try your suggestion now. Apologies for not noticing it right away, the forum provides no notification for edited posts ;)
 
So.. interesting.

I stopped sending dynamic data and instead sent static data in tables, limited to 2 bytes as you requested. Each message goes out as 2 CAN frames, and I do sometimes see messages from the two nodes overlapping, but it never crashes.

To test my code, I changed the target IDs of my buffer copy code to nodes 20 and 30 instead of 2 and 3, so that the copy is always skipped, as it never sees those nodes on the bus. When sending the large packets, it crashes.

To be extra sure, I commented out my code that copies the buffers to the structs entirely. It took a little longer to crash, but it still does. I am no longer writing to any variables in memory that I control and am just receiving packets from 2 sources at once. In this console output, you'll note that the node 2 and 3 variables are zero, because, as stated, the code that copies that information to the variable byte arrays is commented out entirely. Nothing happens to the packets once they are received other than the callback printing them to the console to show it got them.

Code:
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 18406 ID: 1FFD4083 Buffer: 0 0 0 C BE F8 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 18554 ID: 1FFD8083 Buffer: 0 1 8 9D C8 48 1 85 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 18695 ID: 1FFD8083 Buffer: 0 2 DA 7 44 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 18835 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 157 200 72 1 133 218 7 68 148 135 69 65 
NODE 2 A: 0
NODE 2 B: 0.00
NODE 2 ELAPSED: 0
NODE 3 A: 0
NODE 3 B: 0.00
NODE 3 ELAPSED: 0
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 33429 ID: 1FFC0002 Buffer: 2 8 10 0 B B8 0 0 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 52405 ID: 1FFD4083 Buffer: 0 0 0 C BE AF FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 52552 ID: 1FFD8083 Buffer: 0 1 8 9E C8 48 1 EB 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 52694 ID: 1FFD8083 Buffer: 0 2 E0 7 44 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 52835 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
Node: 3	PacketID: 8	Broadcast: 0	Data: 158 200 72 1 235 224 7 68 148 135 69 65 
NODE 2 A: 0
NODE 2 B: 0.00
NODE 2 ELAPSED: 0
NODE 3 A: 0
NODE 3 B: 0.00
NODE 3 ELAPSED: 0
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 20869 ID: 1FFD4083 Buffer: 0 0 0 C BE 13 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 21016 ID: 1FFD4082 Buffer: 0 0 0 14 BE 62 FF FE 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 21161 ID: 1FFD8083 Buffer: 0 1 8 9F C8 48 1 51 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 21302 ID: 1FFD8082 Buffer: 0 1 7 A C8 BC 0 A8 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 21444 ID: 1FFD8083 Buffer: 0 2 E7 7 44 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 21584 ID: 1FFD8082 Buffer: 0 2 99 85 41 94 87 45 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 21723 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA 
ISR - MB 99  OVERRUN: 0  LEN: 8 EXT: 1 TS: 21862 ID: 1FFD8082 Buffer: 0 3 41 EC 51 9A C4 64 
Node: 3	PacketID: 8	Broadcast: 0	Data: 159 200 72 1 81 231 7 68 148 135 69 65 
NODE 2 A: 0
NODE 2 B: 0.00
NODE 2 ELAPSED: 0
NODE 3 A: 0
NODE 3 B: 0.00
NODE 3 ELAPSED: 0

So as you say, when the packet groups are kept small, it seems immune to crashes. When they grow big, that's when the crashing happens.

Does this mean it's a queue management issue?
 
Ok! Well, on one hand I'm glad to hear I didn't mess any of my code up, on the other hand, I was actually really hoping that I was wrong and the bugs were my fault so then I could just be corrected and keep moving on with my project ;)

Thank you for the help troubleshooting. I guess I'm at a dead end if I want to use this library for multiple nodes sending data blindly without any kind of central system to queue them and time their sending to never try and transmit on top of another packet series, correct?
 