Issue with USB MIDI failing over time when using usbMIDI.sendRealTime

zenbob - on the PC can you close and restart the PC plotting app without disturbing the Teensy? Or at least resize it or make some adjustment that would leave it running but cause it to flush its stored log data for the signal?

On your PC did you have TaskManager open to the Processes or Performance tabs and monitor GPU performance - if your PC has an active GPU unit? Paul will likely do that on his next run as it may show a spike to explain his fan powering up at the time of the stoppage.

If no GPU perhaps on the Performance tab with CPU selected right click on the graph and enable 'Show kernel times' - it may show increased activity.
 
I'm starting another test, this time with the USB protocol analyzer and a frequency counter connected to pin 13.

freq.jpg

When the problem happens again, I *really* want to see if this is still 4 Hz, or changes relative to what Windows sees.


It sounds like Windows enters into the 'condition' not Teensy.

Certainly some of the evidence looks that way, but I'm not ready to blame everything on Microsoft just yet...
 
Here's the test hardware.

DSC_0199_web.jpg

The laptop is a 2011 MacBook Pro (i7-2760QM @ 2.4 GHz, 8GB RAM) which dual boots between MacOS and Windows. It's running Windows 10, version 1803 (OS Build 17134.228).

My Linux desktop (i7-8086K @ 4 GHz, 64GB RAM) is running the protocol analyzer software to log all the communication, so the logging doesn't place any load on the Windows machine.
 
Did you find and enable the GPU column in TaskMan? Assuming the MacBook has a GPU that can be monitored independently, it might show a spike for that app.
 
Here's what the USB communication looks like when it's working. This is after running about half an hour.

sc.jpg

The 2 highlighted lines are the Note-off messages the code sends with:

Code:
  if ((gMIDI_Count + 18) % 24 == 0) { // 18 is 75% (of 24) gate time
    usbMIDI.sendNoteOff(67, 0, 1, 0);
    usbMIDI.sendNoteOff(67, 0, 1, 1);
    usbMIDI.sendNoteOff(67, 0, 1, 2);
    usbMIDI.sendNoteOff(67, 0, 1, 3);
    usbMIDI.send_now();
    //Serial.println(F("Stop Note..."));
  }

If you look at the timestamps, 34:56.826.668 and 34:57.326.721, they're almost exactly 0.5 second apart. The timing looks very good.

Likewise, if you subtract any 2 timestamps from the many real-time clock messages (the packets with 0F F8... data), they're all about 20 to 21 ms apart.
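
As a quick check of those numbers, here is a minimal host-side C++ sketch of the arithmetic. It assumes the 120 BPM tempo mentioned later in the thread and the standard 24 MIDI clocks per quarter note; `clockIntervalMs` and `timestampUs` are hypothetical helper names, not part of any Teensy library.

```cpp
#include <cassert>

// Interval between MIDI clock (0xF8) messages, in milliseconds.
// ppqn defaults to 24, the standard number of MIDI clocks per quarter note.
double clockIntervalMs(double bpm, int ppqn = 24) {
    assert(bpm > 0 && ppqn > 0);
    return 60000.0 / (bpm * ppqn);
}

// Convert an analyzer timestamp of the form mm:ss.mmm.uuu to microseconds.
long long timestampUs(int minutes, int seconds, int millis, int micros) {
    return ((minutes * 60LL + seconds) * 1000 + millis) * 1000 + micros;
}
```

With these, `clockIntervalMs(120)` comes out at roughly 20.83 ms, which matches the 20 to 21 ms spacing seen on the analyzer, and `timestampUs(34, 57, 326, 721) - timestampUs(34, 56, 826, 668)` gives 500053 microseconds, i.e. 24 clock intervals or one beat.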
 
Defragster - wrt the monitoring app, it hasn't mattered if the app is running or not, or which app is used to monitor. In other words, for MIDI it doesn't seem to matter if we "consume" the data in Windows. Besides MIDIClock, we've used MIDI-OX and various Windows MIDI instruments such as DEXED and Arturia instruments to observe behavior. We've tried lots of permutations, but the only thing with an observable effect is opening a serial terminal, which causes it to recover briefly.

I'll check the GPU, but I haven't had it fail in the past two days. :confused: Going to try a reboot now. Oh, I just rebooted my music desk PC and it got the .228 update but my work desk PC hasn't updated.
 
Interesting - I was just keying off the fan speed rise from Paul assuming an app issue - it may have indeed been Windows that got overwhelmed given it happens with no app or other apps.

Paul is another hour closer to seeing the issue while monitoring ...
 
Ok, it finally started happening. Something is going very wrong. Looks like it may be on the Teensy side.

I'm going to add some more hardware (and restart the test all over again) to dig into interrupt timing with my scope.

bad.jpg
 
I finally got a failure again, with nothing to speak of wrt the GPU or other utilization. So I did the reboot test: the Teensy stayed running and the BPM recovered, indicating that restarting the Windows side at least temporarily clears the issue.
taksmanagercapr.PNG
Update: After the reboot, as expected, it failed again in about 6 minutes. Opening a serial terminal did not restore it completely, maybe 50% better clock tracking before it went awry again.
midiclockafterrebootserial.png
 
I'm *still* working on this problem here. I'm almost certain the problem is on the Teensy side. It's looking like some sort of race condition deep within the USB stack, probably related to doing the transmission from this IntervalTimer interrupt. Windows just happens to do things with timing that reproduces the problem.
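
For readers unfamiliar with this class of bug: a common way to avoid calling into a USB stack from a timer interrupt is to have the interrupt only count ticks and let the main loop do the transmission. The sketch below is a host-side C++ illustration of that general pattern, not the fix discussed in this thread. `pendingTicks`, `onTimerTick`, and `drainTicks` are hypothetical names; on a Teensy the callable passed to `drainTicks` would wrap something like usbMIDI.sendRealTime(usbMIDI.Clock).

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Ticks recorded by the timer interrupt but not yet transmitted.
std::atomic<uint32_t> pendingTicks{0};

// Stand-in for the IntervalTimer callback: it only counts and never
// touches the USB stack.
void onTimerTick() {
    pendingTicks.fetch_add(1, std::memory_order_relaxed);
}

// Called from the main loop: atomically claims all pending ticks and
// invokes sendClock once per tick. Returns the number of ticks sent.
template <typename SendClock>
uint32_t drainTicks(SendClock sendClock) {
    uint32_t n = pendingTicks.exchange(0, std::memory_order_relaxed);
    for (uint32_t i = 0; i < n; ++i) {
        sendClock();
    }
    return n;
}
```

The trade-off is timing: clock messages leave whenever the main loop next runs rather than at the exact interrupt time, so this only suits sketches whose loop is short compared with the roughly 20 ms clock interval.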

This is a very hard problem, especially since it takes many minutes to reproduce it. I found reducing the number of buffers in usb_desc.h makes it happen sooner (at least with my Windows test machine), but ~950 seconds is the fastest I've found so far. The sensitivity to the number of buffers fits the idea of a buffer-handling race in the USB stack.

Also noticed, perhaps related or maybe a separate issue, is that the Serial.println() from this interrupt doesn't work with Linux unless a program is listening for the MIDI messages. I started working with that today, but it's also still a mystery.

It's going to be a while to really get to the bottom of this. I'm going to work with this more Saturday afternoon, then I'm going to be away Sunday and probably Monday for a trip to Seattle to help Burning Man friends with last-minute project building. Will be back on this later next week. I will eventually get to the bottom of this problem. It's a tough one, so this is just going to take time.
 
Happy to have given you a real challenge :D Have fun on that Burning Man project, I'm sure it must be interesting!

In the meantime, we have a workaround for our Teensy-powered widget; we just changed the default MIDI ports from "all" to the 5-pin DINs so customers won't get into a wonky state just because they are powering from a PC USB host. Anyway, we are still a few weeks from shipping, so maybe we'll get lucky with a fix. By the way, we are in Portland, and if there's anything we can do to assist or test, just let me know.
 
Please give this fix a try. Does it help?

On Windows, the default location is C:\Program Files (x86)\Arduino\hardware\teensy\avr\cores\teensy3.
 

Attachment: usb_midi.c (14.2 KB)

I have one still running at about 4500 secs, the other croaked at about 2700 secs :(. I'm going to reboot the system and restart the test.

The first system failed too, at about 7800 secs. As before, opening the serial monitor recovered it to a steady 120BPM.
 
Any chance that file didn't get installed or used? I ran it here for 20000 seconds.

Started a 2nd test, already up to ~3400 seconds. So far, so good.

In a few days I'm going to make the first 1.43 beta installer. Maybe give that a try?
 
I tried changing the build options to LTO to force a recompile of the libraries, but it didn't help. We also tried putting a Serial.print in the .c file, but that doesn't work. So we've built our full codebase with the new usb_midi.c and have that running on two systems for >9000 seconds without a problem. Looking at the build log, it looks like it's picking up the usb_midi.c from the expected path.

Yes, we'll definitely try the 1.43 beta.
 
Easy way to have the compiler show you it is being compiled is to drop off a semi-colon or other syntax error.

Closing the IDE and coming back in is sure to dump the temp build files. Though usually the compiler dependencies catch newer files.

Paul pushed a change I was looking at the other day - and I put the file in the wrong installed IDE tree and oddly the fix didn't work …
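
A variant of the same check that avoids breaking real code, assuming the GCC-based toolchain the Teensy cores build with: put a temporary `#error` line at the top of the copy of the file you believe is being used. If the build stops with that message, that copy is the one being compiled; if the build succeeds, the compiler is reading a different tree. The message text is only an example.

```c
/* Temporary marker: remove after confirming which copy is compiled. */
#error "patched usb_midi.c is being compiled from this tree"
```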
 
I've been running it here since last night. Still at 120 bpm. Frequency counter still shows stable 3.999996 Hz on pin 13.

capture.png

freq.jpg

Need to shut this test down now, so I can get ready for Arduino 1.8.6.

Will post here again when there's a 1.43-beta1 installer to test.
 
Yes, it's looking good Paul! I think none of my tests have been valid for one reason or another. For instance, on our codebase, we've already removed all of the print statements so there was no serial output. So it wouldn't have failed.
I'm also following defragster's advice to test the build of the test code.
Thanks for looking into this and coming up with a solution so quickly!
 
I'll give it a shot Paul, but unfortunately we've been readily reproducing the issue with the new code. I did the "remove a semicolon" test and verified I was building with the new bits. Last week I couldn't reproduce the issue. I was messing with the code trying to push more bits with a BPM of 650 and lots of serial prints and it just ran longer and longer as the week went on! So full moon weird. Then this week I went back to the code I posted above, built with the new usb_midi.c and it failed in about 30 mins.

The only thing I noticed that was different is that opening the serial terminal recovered it, it failed again, and it recovered a second time on opening the serial terminal. I never had it recover a second time before, though it's probably insignificant.

On another system, we built our full codebase (well over 10k lines of code) with the new usb_midi.c. It's been running for a couple of days now, but I don't expect it to fail because there are no serial prints. It won't fail with MIDI notes (usbMIDI.sendNoteOn) and serial prints without MIDI clock (usbMIDI.sendRealTime), or with MIDI notes and MIDI clock without serial prints.
 