I appear to be running into a new issue. I have added a third node and am stress testing the system to see how much traffic it can handle, and I seem to be crashing my master.
I can share source code if you think it's an issue with my coding. I'll admit it's very messy while I learn.
The setup is as follows:
Node ID 1: Master, does not transmit anything, just receives. This is the one that crashes. Connects to CAN 2.0 over can1 at 1,000,000 baud.
Node ID 2: Slave1, sends a 20-byte payload once per 1000 ms, packet ID 7.
Node ID 3: Slave2, sends a 12-byte payload once per 33 ms, packet ID 8 (a stress test to replicate how fast I'd like some of the data to go for gauges and whatnot in real time).
The system works perfectly for approximately 10-60 seconds, until the master's serial monitor shows it receiving a larger packet than should be possible, and then the Teensy crashes.
Normal packet from node2:
Code:
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 10226 ID: 1FFD4082 Buffer: 0 0 0 14 BE 0 FF FE
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 10374 ID: 1FFD8082 Buffer: 0 1 7 17 C8 70 2 4E
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 10517 ID: 1FFD8082 Buffer: 0 2 66 2E 42 94 87 45
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 10656 ID: 1FFD8082 Buffer: 0 3 41 EC 51 9A C4 64
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 10798 ID: 1FFDC082 Buffer: 0 4 20 F1 47 AA AA AA
Node: 2 PacketID: 7 Broadcast: 0 Data: 23 200 112 2 78 102 46 66 148 135 69 65 236 81 154 196 100 32 241 71
Normal packet from node3:
Code:
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 55385 ID: 1FFD4083 Buffer: 0 0 0 C BE 11 FF FE
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 55532 ID: 1FFD8083 Buffer: 0 1 8 BE C8 A4 2 B1
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 55673 ID: 1FFD8083 Buffer: 0 2 5A 91 43 94 87 45
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 55813 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA
Node: 3 PacketID: 8 Broadcast: 0 Data: 190 200 164 2 177 90 145 67 148 135 69 65
Here is my console right before the crash. Ignore the other stuff, that's just testing to see if data is actually updating in a format I can read as it screams past.
Code:
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 36653 ID: 1FFD4083 Buffer: 0 0 0 C BE 87 FF FE
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 36799 ID: 1FFD8083 Buffer: 0 1 8 66 C8 E7 5 9B
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 36940 ID: 1FFD8083 Buffer: 0 2 E0 C 44 94 87 45
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 37081 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA
Node: 3 PacketID: 8 Broadcast: 0 Data: 102 200 231 5 155 224 12 68 148 135 69 65
NODE 2 A: 81
NODE 2 B: 100.60
NODE 2 ELAPSED: 1459
NODE 3 A: 102
NODE 3 B: 563.51
NODE 3 ELAPSED: 1511
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 5117 ID: 1FFD4083 Buffer: 0 0 0 C BE 1B FF FE
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 5264 ID: 1FFD8083 Buffer: 0 1 8 67 C8 E7 5 1
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 5406 ID: 1FFD8083 Buffer: 0 2 E7 C 44 94 87 45
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 5546 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA
Node: 3 PacketID: 8 Broadcast: 0 Data: 103 200 231 5 1 231 12 68 148 135 69 65
NODE 2 A: 81
NODE 2 B: 100.60
NODE 2 ELAPSED: 1459
NODE 3 A: 103
NODE 3 B: 563.61
NODE 3 ELAPSED: 1511
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 39116 ID: 1FFD4083 Buffer: 0 0 0 C BE 78 FF FE
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 39263 ID: 1FFD8083 Buffer: 0 1 8 68 C8 E7 5 67
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 39403 ID: 1FFD4082 Buffer: 0 0 0 14 BE C4 FF FE
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 39550 ID: 1FFD8083 Buffer: 0 2 ED C 44 94 87 45
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 39688 ID: 1FFD8082 Buffer: 0 1 7 52 C8 B4 5 E8
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 39828 ID: 1FFDC083 Buffer: 0 3 41 AA AA AA AA AA
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 39967 ID: 1FFD8082 Buffer: 0 2 65 C9 42 94 87 45
Node: 3 PacketID: 8 Broadcast: 0 Data: 104 200 231 5 103 237 12 68 148 135 69 65
NODE 2 A: 81
NODE 2 B: 100.60
NODE 2 ELAPSED: 1459
NODE 3 A: 104
NODE 3 B: 563.71
NODE 3 ELAPSED: 1511
ISR - MB 99 OVERRUN: 0 LEN: 8 EXT: 1 TS: 40250 ID: 1FFD8082 Buffer: 0 3 41 EC 51 9A C4 64
It seems like normally, node 2 sends 5 frames at once and node 3 sends 4 frames at once. Then the master receives something it perceives as one giant group of many more, and locks up. I guess in some tricky situations where the timing is juuuuuuust right, the TeensyCAN library can lose track of which frames belong to which packet and just combines everything it hears into one giant packet? That's what I'm understanding from the console dump above.
Is there anything I can do to stop this from happening? I feel like this quirk of timing can happen to any group of nodes, it just happens very quickly when I have one of them transmitting very quickly.
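To make the idea of keeping interleaved multi-frame transfers apart concrete, here is a minimal sketch that keys reassembly on the sending node. The frame layout is only what can be inferred from the dumps above (low 7 bits of the ID = node, bits 14-15 of the ID = first/middle/last, data byte 1 = sequence number, data byte 3 of the first frame = payload length, data byte 2 of the second frame = packet ID). It is not taken from the TeensyCAN source, and every name in it (`Frame`, `Reassembly`, `feedFrame`, the limits) is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxNodes = 8;     // hypothetical limit for this sketch
constexpr std::size_t kMaxPayload = 64;  // hypothetical limit for this sketch

struct Frame {
  std::uint32_t id;      // 29-bit extended CAN ID as printed in the dump
  std::uint8_t data[8];  // the 8 data bytes ("Buffer:" in the dump)
};

// One reassembly slot per sending node, so frames from different nodes
// can interleave on the bus without being merged into one packet.
struct Reassembly {
  bool active = false;
  std::uint8_t expectedLen = 0;  // payload length announced by the first frame
  std::uint8_t nextSeq = 0;      // sequence number expected next
  std::uint8_t packetId = 0;
  std::size_t count = 0;         // payload bytes collected so far
  std::uint8_t payload[kMaxPayload] = {};
};

Reassembly g_slots[kMaxNodes];

// Feed one received frame. Returns the completed slot when the last frame
// of a transfer arrives with the announced length, otherwise nullptr.
const Reassembly *feedFrame(const Frame &f) {
  const std::uint8_t node = static_cast<std::uint8_t>(f.id & 0x7F);
  const std::uint8_t kind = static_cast<std::uint8_t>((f.id >> 14) & 0x3);
  if (node >= kMaxNodes) return nullptr;
  Reassembly &s = g_slots[node];

  if (kind == 1) {  // first frame: reset this node's slot only
    s = Reassembly{};
    s.expectedLen = f.data[3];
    s.nextSeq = 1;
    s.active = (s.expectedLen <= kMaxPayload);  // refuse oversized transfers
    return nullptr;
  }

  // Middle or last frame: drop the transfer if it is out of sequence.
  if (!s.active || f.data[1] != s.nextSeq) {
    s.active = false;
    return nullptr;
  }

  std::size_t start = 2;
  if (s.nextSeq == 1) {  // second frame carries the packet ID before the data
    s.packetId = f.data[2];
    start = 3;
  }
  // Never store more bytes than the first frame announced.
  for (std::size_t i = start; i < 8 && s.count < s.expectedLen; ++i) {
    s.payload[s.count++] = f.data[i];
  }
  ++s.nextSeq;

  if (kind == 3) {  // last frame: transfer is finished either way
    s.active = false;
    return (s.count == s.expectedLen) ? &s : nullptr;
  }
  return nullptr;
}
```

Fed the interleaved frames from the last dump in order, a scheme like this would complete node 3's 12-byte packet while node 2's transfer stays pending in its own slot, and the length check keeps any single packet from growing past what its first frame announced.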
At first I thought that it was crashing because my code was receiving a larger than expected packet and overflowing, but this is the code I use to copy the buffer into the structure. I will be changing it to a state machine eventually, but for now this is my quick and dirty test:
Code:
if (info.node == 2 && info.packetid == 7) { // node 2, CAN bus packet ID 7: copy buffer into the structure
  for (size_t i = 0; i < sizeof(testPacket_t); i++) { // step through the full size of the structure
    datatest.testPacket[i] = buffer[i]; // copy one byte at a time
  }
}
if (info.node == 3 && info.packetid == 8) { // node 3, CAN bus packet ID 8: copy buffer into the structure
  for (size_t i = 0; i < sizeof(testPacket2_t); i++) { // step through the full size of the structure
    datatest2.testPacket2[i] = buffer[i]; // copy one byte at a time
  }
}
As you can see above, at first I thought that when the system gets a larger than expected packet, it would overflow the array and crash. But I realize that I'm setting my for loop count using the size of the specified structure, not the size of the received data. So it should be impossible to overflow the byte array, and only possible to corrupt numbers. Unless writing a float incorrectly can also crash the Teensy?
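One caveat with a loop bounded only by the structure size is that it still reads that many bytes from `buffer`, even if fewer arrived. A minimal sketch of a copy clamped to both lengths is below. It assumes the receive callback can report how many bytes were actually received; `receivedLen`, `copyPayload` and `ExamplePacket` are hypothetical names, and whether TeensyCAN exposes such a length is not confirmed here.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy no more than the destination can hold and no
// more than was actually received. Returns the number of bytes copied.
std::size_t copyPayload(void *dst, std::size_t dstSize,
                        const std::uint8_t *src, std::size_t receivedLen) {
  if (dst == nullptr || src == nullptr) return 0;
  const std::size_t n = (receivedLen < dstSize) ? receivedLen : dstSize;
  std::memcpy(dst, src, n);
  return n;
}

// Hypothetical 12-byte layout standing in for testPacket2_t.
struct ExamplePacket {
  std::uint8_t bytes[12];
};
```

A caller could compare the return value against the structure size and discard the packet when they differ, rather than decoding a partially filled structure.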
We are at the very edge of my understanding here haha..
I calculated that the worst-case packet, with bit stuffing and gaps of 6 transmission clocks, results in me being able to send around 6200 full-sized, bit-stuffed packets per second at 1,000,000 baud. This is nowhere near that: I believe it's only around 125-126 CAN frames per second, considering that each node 2 transmission is 5 CAN bus frames at 1/sec (5 frames) and each node 3 transmission is 4 CAN bus frames at about 30/sec (roughly 121 frames).
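The load estimate can be checked with a few lines. The frame counts (5 and 4), the periods (1000 ms and 33 ms) and the 6200 frames/sec ceiling are the figures from this post; `framesPerSecond` and the constant names are hypothetical.

```cpp
// Frames per second contributed by one node:
// frames per transfer * transfers per second.
constexpr double framesPerSecond(int framesPerTransfer, double periodMs) {
  return framesPerTransfer * 1000.0 / periodMs;
}

constexpr double kNode2Fps = framesPerSecond(5, 1000.0);  // 5 frames/sec
constexpr double kNode3Fps = framesPerSecond(4, 33.0);    // about 121 frames/sec
constexpr double kTotalFps = kNode2Fps + kNode3Fps;       // about 126 frames/sec

constexpr double kBusCapacityFps = 6200.0;  // worst-case estimate from the post
constexpr double kLoadFraction = kTotalFps / kBusCapacityFps;  // about 0.02
```

Node 3 alone accounts for about 121 of those frames, and the combined figure comes to roughly 2% of the estimated ceiling, so raw bus bandwidth does not look like the limiting factor.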
Let me know if you have seen anything like this before. While I was typing this I set the node 3 rate down to just 5 per second. It admittedly took a lot longer to crash, but it still crashed for the same reason, with a huge packet of data at the bottom of the console.
Something else to make note of is that when I relaunch the master, both slaves start transmitting again happily, they survive this just fine.
Another observation: one node sending a 5-frame payload (20 bytes) at 100 sends per second (500 CAN frames per second) was left running for 16 hours, and it didn't miss a single packet or crash at all. The severe instability only started when I introduced a third node to the system.
I will attach all 3 sketches in a following post.