Pyserial script throwing OSError: [Errno 5] Input/output error while communicating with T41

shookti

I am maintaining a project that performs long term (weeks/months) communication with a Teensy 4.1 using Python3 and pyserial (version 3.5) that runs on both Ubuntu 22.04 and Raspbian Bullseye (basically various Linux distros).

The pyserial Serial object is created using the "/dev/serial/by-id/usb-Teensyduino_USB_Serial_13124230-if00" symlink to the corresponding /dev/ttyACMx port and a baud rate of 500000. Yes, I am aware that baud rate is somewhat of a misnomer in this case, as USB communication runs at the native USB speed.

The code that the Teensy is running uses Serial.print() statements to transmit data to the computer (no usage of Serial.write at all).
The Teensy has a timer interrupt running at 1 kHz (good old IntervalTimer, not TeensyTimerTool) that prints lines that look like this: "A,123456,2795,2820,"
There are other lines that are printed outside of the interrupt that look like: "K,11,62,1,440471,0,10,10,"

I know that using Serial.print in an interrupt is extremely finicky, but the project I am working on requires such an interrupt for a lot of other functionality to work correctly.
I do see artefacts of this finickiness when the "K,11,62,1,440471,0,10,10," (non-interrupt) lines are interrupted by the "A,123456,2795,2820," (interrupt) lines, causing lines to look like:
interrupted:
K,11,62,1,A,0,2795,2820,
0440471,0,10,10,


Those lines can easily be dealt with on the python side where the data parsing happens so I am not super worried about them.
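For reference, a minimal sketch of how such mangled lines could be filtered on the Python side. The two patterns are only inferred from the example lines above (an "A" line with three non-negative integer fields, a "K" line with seven), and `parse_line` is a hypothetical helper, not the project's actual parser:

```python
import re

# Assumed line shapes, inferred from the two example lines in this post.
LINE_PATTERNS = {
    "A": re.compile(r"A,(\d+),(\d+),(\d+),"),
    "K": re.compile(r"K,(\d+),(\d+),(\d+),(\d+),(\d+),(\d+),(\d+),"),
}

def parse_line(line):
    """Return (tag, [ints]) for a well-formed line, or None if it looks mangled."""
    line = line.strip()
    pattern = LINE_PATTERNS.get(line[:1])
    if pattern is None:
        return None  # unknown tag, e.g. the tail of an interrupted line
    match = pattern.fullmatch(line)
    if match is None:
        return None  # right tag, wrong field count or non-numeric field
    return line[0], [int(field) for field in match.groups()]
```

With this, the interrupted fragments "K,11,62,1,A,0,2795,2820," and "0440471,0,10,10," are both rejected, while intact "A" and "K" lines are returned as integers.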

The Teensy stores at most one String in memory, and it is around 30 chars long.

The problem I am facing is that every once in a while - could be a few seconds after the program starts running or a few hours - the Python script which is constantly saving lines sent by the Teensy to disk throws the following error:

Code:
Traceback (most recent call last):
  File "/home/shookti/OneZero/ATM/AutoTrainerModular/MainCode.py", line 850, in <module>
    main()
  File "/home/shookti/OneZero/ATM/AutoTrainerModular/MainCode.py", line 831, in main
    debugMonitor(ser)
  File "/home/shookti/OneZero/ATM/AutoTrainerModular/MainCode.py", line 686, in debugMonitor
    chunk = ser.readline(ser.inWaiting())
  File "/home/shookti/.local/lib/python3.10/site-packages/serial/serialutil.py", line 594, in inWaiting
    return self.in_waiting
  File "/home/shookti/.local/lib/python3.10/site-packages/serial/serialposix.py", line 549, in in_waiting
    s = fcntl.ioctl(self.fd, TIOCINQ, TIOCM_zero_str)
OSError: [Errno 5] Input/output error

What might be some possible causes for this error?

Could this line, that uses the python built in library to perform file descriptor operations
Code:
s = fcntl.ioctl(self.fd, TIOCINQ, TIOCM_zero_str)
imply that there might be something going wrong with the file descriptors attached to the script or the "/dev/serial/by-id/usb-Teensyduino_USB_Serial_13124230-if00" file?

I have tried to check whether the Teensy is crashing due to a memory leak by having an interrupt blink the LED_BUILTIN. The blinking never stops, even when that error is thrown. This probably is not a good way to test for memory leaks (tips on how to do so would be great).

Sometimes, even more rarely than the above error,

Code:
chunk = ser.readline(ser.inWaiting())
try:
    data += chunk.decode("utf-8")
except Exception as e:
    print(e)
    print("\ndata:%s\nchunk:\n%s\n" % (data, chunk))

catches an "invalid start byte" decode error and resumes normal execution.
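A possible way to make this decode step more tolerant, sketched under assumptions: keep the incoming data as bytes, decode only complete lines, and substitute a replacement character for any invalid byte instead of raising. `split_lines` is a hypothetical helper, and the "\r\n" or "\n" line ending is an assumption about what the Teensy sends:

```python
def split_lines(buffer, chunk):
    """Append chunk to buffer and return (complete_lines, remaining_bytes).

    Only newline-terminated lines are decoded, so a chunk that ends
    mid-line stays in the byte buffer until the rest of it arrives.
    """
    buffer += chunk
    # Everything before the last newline is complete; the tail is partial.
    *complete, remainder = buffer.split(b"\n")
    lines = [
        raw.rstrip(b"\r").decode("utf-8", errors="replace")
        for raw in complete
    ]
    return lines, remainder
```

A corrupted byte then shows up as U+FFFD inside one line, which a later validation step can reject, rather than as an exception that affects the whole chunk.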

Trying to deal with the
OSError: [Errno 5] Input/output error
by using a try-except block also works fine most of the time, but I am worried about losing data that the Teensy sends while the try-except block is doing its thing. The block takes a few seconds, a significant part of which comes from a time.sleep line that waits for the /dev/serial/by-id/ file to be created.
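A rough sketch of what such a recovery step could look like with a short polling loop in place of one long sleep. This is not the project's actual code: `wait_for_device` and `read_with_reconnect` are hypothetical names, `open_port` is any callable that returns a freshly opened port, and only the raw Errno 5 from the traceback above is handled, so every other exception still propagates:

```python
import errno
import os
import time

def wait_for_device(path, timeout=10.0, poll_interval=0.05):
    """Poll until path exists; return True if it appeared within timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_interval)
    return os.path.exists(path)

def read_with_reconnect(port, open_port, path, timeout=10.0):
    """Read one chunk; on EIO, close the port, wait for path, and reopen.

    Returns (port, chunk). chunk is b"" when a reconnect happened.
    """
    try:
        # Same call pattern as in the traceback: in_waiting, then readline.
        return port, port.readline(port.in_waiting)
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise  # only the Errno 5 case is handled here
    try:
        port.close()
    except OSError:
        pass  # the descriptor may already be unusable
    if not wait_for_device(path, timeout):
        raise FileNotFoundError(path)
    return open_port(), b""
```

With pyserial, `open_port` could be something like `lambda: serial.Serial(path, 500000)`. Polling every few tens of milliseconds should shorten the gap compared with a fixed multi-second sleep, but it probably cannot recover anything the Teensy tried to send while the port was gone.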

I would greatly appreciate it if someone could tell me what would reproduce this issue in some simpler code so that I can then draw parallels with it in this large project or guide me through the process of debugging this.

Thanks!
 
Have seen an interrupt like LED BLINK continue after loop() stops, so that isn't the best measure. Maybe the blink could be altered based on processing done by a running loop() (or other interrupt code), e.g. a double blink, or a switch from an off/on/off blink to an on/off/on blink, so that a different pattern shows up.

If printing from both an interrupt and loop(), the output can easily conflict and intermix.
> if the interrupt sprintf()'d text to a buffer then loop() could display it. Perhaps do this as a debug step if nothing else to see the effect.

What is the print/spew rate of the teensy output?
> The Teensy can push data out too fast for even the best handlers to process. Others have found Python too slow to keep up without careful attention.

If there were real faults then CrashReport would show them after a Teensy 8 second pause and restart
 
Have seen an interrupt like LED BLINK continue after loop stops, so that isn't the best measure. Maybe if the blink was altered based on processing by loop() running - or other interrupt code - to do double blink or switch from off/on/off blink to on/off/on blink showing a diff pattern
I have verified that the Teensy does not crash by using the try-except block that catches the OSError errno 5 and recreates the pyserial connection. I have seen that the data the Teensy is reporting after that are not random numbers but accurate results of some computations done by the Teensy.
However, there are instances when the try-except block fails to recreate the connection and resume normal operation: it throws an error saying that the "/dev/serial/by-id/usb-Teensyduino_USB_Serial_13124230-if00" file does not exist. Maybe this is when the Teensy is actually crashing. I have put another try-except block nested within, with a longer sleep time (10 seconds), to debug this.

if the interrupt sprintf()'d text to a buffer then loop() could display it. Perhaps do this as a debug step if nothing else to see the effect.
The code has been like this (an interrupt using Serial.print, and lines being annoyingly interrupted) for many months now without facing this OSError Errno 5 issue before.

If there were real faults then CrashReport would show them after a Teensy 8 second pause and restart
Could this be helpful even after verifying (weakly, as mentioned above) that the Teensy does not crash? If so, where is the documentation that describes how to obtain and understand crash reports from a Teensy 4.1?
 
Is keeping track of a bunch of variables storing ints, let's say 5, that are defined at the beginning of the program and are reported periodically to the computer a good way to find memory leaks?
 