Can I get MIDI latency as low as the Bela Mini (based on the BeagleBone Black, 1 GHz ARM Cortex)?



I have built my own MIDI controller. It has no audio part; it only sends MIDI messages.

I want the lowest possible MIDI latency. On paper, the Bela Mini can reach 0.5 ms.

Currently I use the internal ADC to read the pots and faders as 12-bit values from the analog pins (more than enough for MIDI control), then bit-shift them to 14 bits and send them as MSB and LSB MIDI messages.
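
A minimal sketch of that widening and split, assuming the usual MIDI convention that the MSB goes on controller number N (0..31) and the LSB on controller N + 32. The names `Cc14` and `splitReading12` are made up for illustration and are not from the poster's code.

```cpp
#include <cstdint>

// Two 7-bit data bytes carrying one 14-bit controller value.
struct Cc14 {
    uint8_t msb;  // upper 7 bits, sent on controller number N (0..31)
    uint8_t lsb;  // lower 7 bits, sent on controller number N + 32
};

// Widen a 12-bit ADC reading (0..4095) to 14 bits and split it.
// The two lowest bits of the 14-bit value are always zero.
inline Cc14 splitReading12(uint16_t adc12) {
    const uint16_t v14 = static_cast<uint16_t>((adc12 & 0x0FFF) << 2);
    return Cc14{ static_cast<uint8_t>(v14 >> 7),
                 static_cast<uint8_t>(v14 & 0x7F) };
}
```

Because the shift leaves the two low bits empty, the LSB byte only ever takes multiples of 4, so the extra resolution over a plain 7-bit CC is 5 bits, not 7.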

In the future I will need more analog inputs, so I plan to add an MCP3208 (12-bit, 8 channels) over SPI.

I have read on this forum that highly optimized code needs DMA to read the analog pins (MCP3208 and/or internal ADC), but if I can get MIDI latency under 2 ms with the standard libraries I will be happy. (If my sound card latency is 5 ms, plus 2 ms for MIDI, that is 7 ms. Anything under 10 ms is fine for me.)

What do I need to do to achieve this goal?

Thanks in advance.
If you're looking to optimize something, you need to be able to measure it. Measuring MIDI latency is not trivial. Do you have a way to measure it?

Also, you need to decide what you mean by "latency". For example, you say that 0.5 msec is possible, yet a standard midi message takes 0.96 msec to transmit (3 bytes + start and stop bit per byte = 30 bits. 30 bits at 31250 bps is 0.96 msec). So, if you think that 0.5 msec is possible, you're clearly excluding the 0.96 msec transmission time. If you're excluding the transmission time, what are you including in your definition of latency? And, more to the point, how will you measure it?
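
The arithmetic above can be written down as a small helper. This is only a sketch for serial (DIN) MIDI; the name `midiWireTimeUs` is invented here, and it counts wire time only, not any software overhead.

```cpp
#include <cstdint>

constexpr uint32_t kMidiBaud    = 31250;  // serial MIDI bit rate
constexpr uint32_t kBitsPerByte = 10;     // 1 start + 8 data + 1 stop

// Wire time, in microseconds, for `bytes` bytes of serial MIDI.
constexpr uint32_t midiWireTimeUs(uint32_t bytes) {
    return bytes * kBitsPerByte * 1000000u / kMidiBaud;
}

static_assert(midiWireTimeUs(3) == 960, "a 3-byte message takes 0.96 ms");
```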

Sorry, I was not very clear.

My goal is to reduce the latency between the moment the values are available from the ADC and the moment the MIDI message is sent. I think every step in between can be improved.

I know the delay to transmit the data is about 1 ms, and that there can be jitter and so on. But those elements are not the subject here, because I can't do anything to improve them.

Even without DMA, a simple loop that reads a potentiometer and if its value has changed sends a suitable MIDI message to the USB host computer, takes much, much less than a millisecond per loop iteration on Teensy 4.x.

After reading an ADC value, you'll want to set up that ADC channel for the next pin (reading each potentiometer pin in round-robin order), then send the MIDI message if the value differs from the previous one, or otherwise optionally delay for a few microseconds, and then start the next conversion. Sending the MIDI message, or the delay, gives the ADC input time to settle. Then you wait until the conversion completes, and repeat. I do not see how DMA could speed up anything here.
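
One possible shape for that loop, as a hardware-independent sketch. Everything here is hypothetical: `scanStep`, `ScanState`, the four-input count, and the `Adc` interface (`result()`, `select()`, `start()`) stand in for whatever the real ADC code provides, and `SimulatedAdc` exists only so the logic can be tried off-target.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumPots = 4;   // assumed number of inputs

// Which input is being converted, and the last value reported per input.
// 0xFFFF marks "nothing reported yet" (not a valid 12-bit reading).
struct ScanState {
    std::size_t index = 0;
    uint16_t last[kNumPots] = {0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF};
};

// One loop iteration. `Adc` must provide result(), select(i) and start();
// `send(i, value)` stands in for the MIDI output.
template <typename Adc, typename Send>
void scanStep(Adc& adc, Send&& send, ScanState& st) {
    const uint16_t value = adc.result();   // fetch the finished conversion
    const std::size_t done = st.index;
    st.index = (st.index + 1) % kNumPots;
    adc.select(st.index);                  // point the ADC at the next pin
    if (value != st.last[done]) {          // sending doubles as settle time
        st.last[done] = value;
        send(done, value);
    }
    adc.start();                           // begin the next conversion
}

// Host-side stand-in for the ADC, for trying the logic without hardware.
struct SimulatedAdc {
    uint16_t pins[kNumPots] = {0, 0, 0, 0};
    std::size_t selected = 0;
    uint16_t latched = 0;
    void select(std::size_t i) { selected = i; }
    void start() { latched = pins[selected]; }
    uint16_t result() const { return latched; }
};
```

Before the first call, select input 0 and start a conversion so that `result()` has something to return. The optional delay for unchanged values is left out of this sketch.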

The overall latency is almost certainly determined by your MIDI message transfer. If the potentiometer data has jitter, then each loop iteration will cause a MIDI message to be sent, meaning your overall latency will be determined by the number of potentiometers you have (since each one that jitters will cause a message to be sent). For example, via (non-USB) MIDI as chipaudette mentioned above, if you have 20 potentiometers, each one causes a message to be sent every loop iteration, and each message takes 0.96 ms, you will have a 19.2 ms interval between updates of the same potentiometer!
In other words, some kind of filtering seems necessary, to ensure unnecessary or unintended MIDI messages are not sent.
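
One common software filter for this is a deadband (hysteresis) check. The sketch below is one option, not necessarily what anyone in this thread uses; the `Deadband` type is made up for illustration and the threshold is an assumption to be tuned per build.

```cpp
#include <cstdint>

// Deadband filter for one control. A new value is accepted only when it
// has moved more than `threshold` counts away from the last accepted one.
struct Deadband {
    uint16_t threshold;
    int32_t  accepted = -1;   // -1 means no value accepted yet

    explicit Deadband(uint16_t t) : threshold(t) {}

    // Returns true when `raw` should be sent as a new MIDI value.
    bool update(uint16_t raw) {
        if (accepted >= 0) {
            const int32_t diff = static_cast<int32_t>(raw) - accepted;
            if (diff <= threshold && diff >= -static_cast<int32_t>(threshold)) {
                return false;   // within the deadband: treat as noise
            }
        }
        accepted = raw;
        return true;
    }
};
```

With `Deadband pot(2);`, a reading that wobbles between 1000 and 1002 produces a single message, while a move to 1003 produces a second one. A side effect is that the extreme end values may not be reported exactly unless they are handled separately.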

Personally, I prefer rotary encoders with incremental output, even though each one needs two (digital) input pins. They have no jitter and no absolute position, and the 600 PPR ones (1.667 pulses per degree) are 'nearly affordable', but they are bulky. (Meaning, although you do need some kind of indicator per encoder – perhaps a small 128×32 OLED display? – you can automagically switch or interpolate between preset states on the controller itself. And you're not limited to a specific angle; you can even make each encoder a multi-turn "pot" if you want, as it is just software at that point.)
Then again, I'm no musician: I found that out when playing with trackers in the mid-nineties (ScreamTracker 3; I had Sound Blaster AWE32 on PC).
Thanks Nominal and the other forum members for answering my questions.

My pots and faders have no jitter because I filter each analog signal with an RC filter (1 kΩ + 100 nF).
I tested extensively with 3 pots and 1 fader, and no MIDI is sent when I don't touch them.

If I read you correctly, if I move 3 controls together (2 faders + 1 pot, for example), I get (0.96 ms × 3) + (a few microseconds per loop for the analog read and the change test), in the case where I use non-USB MIDI.

I am currently testing a 600 PPR encoder, and it works very well, but putting 20 or 30 of them on a controller would be very expensive (about 20 € each on average on AliExpress).

The other problem is that these encoders have very low physical torque. They are too easy to turn, and none of them has a center detent (useful for EQ, for example).
If you are using a Teensy 4, you should be able to get low latency between the ADC and the MIDI send by using the ADC library in an ISR. You can set it up to read one pin per ISR call, with a counter that indexes an array of pin numbers. You can run the ISR at a high rate and use something like this, so it does nothing if the ADC reading is not ready. An advantage of this method is that you can change the ADC settings in setup, and it will just take readings as fast as it can with those settings. Keep in mind that ADC0 and ADC1 do not each have access to all the analog pins.

This is what I have in my ISR for ADC1. It reads 3 pins at 10 bits.

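  if (adc->adc1->isComplete()) {
    // Store the inverted 10-bit result for the pin that just finished
    CV_FILTERED[ADC1_CyC + 3] = 1024 - (adc->readSingle(ADC_1));
    // Advance to the next of the 3 pins, wrapping around
    ADC1_CyC ++;
    if (ADC1_CyC > 2) {
      ADC1_CyC = 0;
    }
    // Start the next conversion; a later ISR call picks up the result
    adc->startSingleRead(cvPins[ADC1_CyC + 3], ADC_1);
  }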

You could add something to test for change greater than n, and set a flag bit that tells MIDI it should send the new value.
Thanks neutron7, I think I will implement this on my Teensy 4.1 (which I will buy soon). Your example uses ADC0 and ADC1 of the Teensy 4, which is only 10-bit.

I need 12 bits, and I think I will use an MCP3208 over SPI to read the pot and fader values.

Do you think this method can work with an external ADC over SPI?
I see a lot of talk about MIDI, but can't tell whether you're talking about serial MIDI (eg, 5 pin DIN connector) or USB MIDI device (typically used for transmitting) or USB MIDI host (typically used for receiving).

With serial MIDI, the main latency is simply the slow 31250 baud rate. Each byte takes 10 bit times (1 start, 8 data, 1 stop). So a 3 byte MIDI message requires approximately 1 ms. Latency from your code transmitting the message to the beginning of that first start bit is usually at most 1 bit time, or 32 us worst case.

With USB MIDI, latency depends on complex interaction between buffering on both the host and device controllers. Assuming you'll use USB device mode, the main thing you should do is call usbMIDI.send_now() after you have sent 1 or more messages. Because USB MIDI groups up to 128 messages per USB packet, you'll face a fundamental trade-off between bandwidth and latency. Each message you send gets written into the transmit buffer. Calling usbMIDI.send_now() causes any not-yet-fully-filled USB packet buffer to be immediately given to the USB controller for transmission. If you never call usbMIDI.send_now(), the MIDI messages you wrote which didn't fill up a maximum size USB packet get transmitted on the next USB micro-frame, which occurs every 125 microseconds at the 480 Mbit/sec USB speed of Teensy 4. Use usbMIDI.send_now() to avoid that 0 to 125 us latency.
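
As a rough sketch of that pattern, the helper below queues a group of controller changes and then flushes once. `CcUpdate` and `sendBatch` are invented names; the port is a template parameter so the logic can be tried off-target, and on a Teensy the global `usbMIDI` object would be passed in, relying on its `sendControlChange(control, value, channel)` and `send_now()` calls.

```cpp
#include <cstddef>
#include <cstdint>

// One pending controller update (hypothetical type for this sketch).
struct CcUpdate {
    uint8_t control;   // controller number, 0..127
    uint8_t value;     // 7-bit value, 0..127
};

// Queue every pending update, then flush the partly filled packet once.
// `Port` is any object with sendControlChange(control, value, channel)
// and send_now().
template <typename Port>
void sendBatch(Port& midi, const CcUpdate* updates, std::size_t count,
               uint8_t channel) {
    if (count == 0) return;            // nothing queued, nothing to flush
    for (std::size_t i = 0; i < count; ++i) {
        midi.sendControlChange(updates[i].control, updates[i].value, channel);
    }
    midi.send_now();                   // hand the packet to the USB controller
}
```

Whether flushing once per group actually beats flushing after every message depends on the host, so this is something to measure rather than assume.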

Overall performance with USB involves so many complex factors that you will probably end up just experimenting. But you'll probably discover the latency is very low no matter what you do, if the events are human created. Usually these things only matter for machine-oriented scenarios, like trying to repurpose the MIDI protocol for a large light show, where you send thousands of messages for each video frame.

To give you an idea of the deeper USB factors, the 2 main practical issues are how your PC's USB host controller chip schedules bandwidth when it asks Teensy for more data but your program hasn't finished any USB buffers yet, and how efficiently the drivers and software on your PC handle incoming MIDI messages. This last point often ends up being the limiting factor. Teensy 4 is extremely fast and can send so many USB packets that even fast PCs can become overwhelmed if the receiving software is not highly optimized. Most host controller chips also have a power saving feature where they leave the USB bus idle for 20 to 50 microseconds if all of the periodic scheduled transfers (every 125 us) are completed and all non-periodic transfers have been attempted and all the devices responded with NAK for no data to transfer. Because of these complex behaviors, even if you don't care about inefficient use of USB bandwidth, you might find calling usbMIDI.send_now() only once after sending a group of MIDI messages gives better performance. Maybe. Or you might also find the performance is so fast that you can't really measure the difference without pretty special equipment. Especially if you use Windows on the PC side, you might discover the non-realtime scheduling of software by Windows is the limiting factor.
On Teensy 4.x, you can use the MCP3208 via 24-byte transfers (or multiples thereof), with the SPI clock at 1 MHz, giving you the state of all 8 MCP3208 analog inputs in 0.2 milliseconds (more specifically, 192 µs plus setup overhead). Using VCC=3.3V, you'll probably want to use 1k linear potentiometers; each potentiometer then draws 3.3 mA. (The ends of the pot go to 3.3V and GND, and the wiper to the analog input on the MCP3208. You may wish to filter that 3.3V line and use it for both VDD and VREF on the MCP3208.) The MCP3208 itself consumes less than a milliamp, so round up to say 30 mA at 3.3V per fully populated MCP3208 with 1k linear potentiometers. (It's not much, something like two bright indicator LEDs.)

The contents of the 3 bytes sent and received via SPI per ADC reading are described in figure 6-1 (if using mode 0,0) and figure 6-2 (mode 1,1) in the datasheet I linked at the beginning of this post.

For output, the first byte will have value 6 for the four first channels, and 7 for four next channels; the next byte will be 0 for channels 1 and 5, 64 for channels 2 and 6, 128 for channels 3 and 7, and 192 for channels 4 and 8. In other words, the send buffer for all eight channels will be
const unsigned char mcp3208_out[24] = {
   0x06,0x00,0x00, 0x06,0x40,0x00, 0x06,0x80,0x00, 0x06,0xC0,0x00,
   0x07,0x00,0x00, 0x07,0x40,0x00, 0x07,0x80,0x00, 0x07,0xC0,0x00,
};
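
The same table can be derived per channel instead of being written out by hand. The helper below is a sketch that reproduces the byte values listed above; `mcp3208Command` is a made-up name, and channels are zero-based here, so the post's "channel 1" is 0.

```cpp
#include <cstdint>

// Fill `out` with the 3 bytes to clock out for one single-ended reading.
// `channel` is zero-based (0..7).
inline void mcp3208Command(uint8_t channel, uint8_t out[3]) {
    channel &= 7;
    out[0] = static_cast<uint8_t>(0x06 | (channel >> 2));  // 6 for channels 0..3, 7 for 4..7
    out[1] = static_cast<uint8_t>((channel & 3) << 6);     // 0, 64, 128 or 192
    out[2] = 0x00;                                         // padding while the result is clocked in
}
```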

For the corresponding input, the third byte contains the low byte, and the second contains the high bits. For example,
unsigned int mcp3208_get_14bit(const unsigned char *inbuf, unsigned char channel)
{
    /* 3 bytes per channel: b[1] low nibble = top 4 bits, b[2] = low 8 bits */
    const unsigned char *b = inbuf + 3 * (channel & 7);
    return (b[2] << 2) | ((b[1] & 0x0F) << 10);
}
where inbuf is the 24-byte input buffer received via SPI when the mcp3208_out buffer is transferred. The function returns the value shifted left by two bits, so that the result is 14 bit.

Another option is to convert all eight in a loop, and return a bit mask of which values were changed:
unsigned int mcp3208_decode(const unsigned char *inbuf, unsigned int *state)
{
    const unsigned char *const inend = inbuf + 24;
    unsigned char  changed = 0;

    while (inbuf < inend) {

        /* New 14-bit state for this analog input */
        unsigned int  newstate = (inbuf[2] << 2) | ((inbuf[1] & 0x0F) << 10);

        inbuf += 3;

        /* Since we want the first entry at LSB of changed, we need to shift right. */
        changed >>= 1;

        /* If state changes, set the corresponding bit (lowest bit = first channel) */
        if (newstate != *state) {
            changed |= 128;
            *state = newstate;
        }

        /* Advance to the stored state of the next channel */
        state++;
    }

    return changed;
}
Given the 24 bytes received via SPI from an MCP3208, and the current 14-bit values of the potentiometers, calling the above function will update the 14-bit values and return which ones changed. For example, if the first and third channels have changed, it will return 0x01 + 0x04 = 0x05 = 5 in decimal.
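
A small sketch of how such a mask might be consumed: it calls a function once per changed channel, lowest channel first. `forEachChanged` is a hypothetical helper, not part of the code above, and the callback is where a MIDI message would be queued.

```cpp
#include <cstdint>

// Call `fn(channel)` for every set bit in `changed`, lowest channel first.
// Channels are zero-based: bit 0 is the first channel.
template <typename Fn>
void forEachChanged(uint8_t changed, Fn&& fn) {
    for (uint8_t channel = 0; changed != 0; ++channel, changed >>= 1) {
        if (changed & 1) {
            fn(channel);
        }
    }
}
```

For the mask 5 from the example, the callback runs for channels 0 and 2, and the loop stops as soon as no set bits remain.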

Even if you use 32 potentiometers (using four separate output pins for the /CS line of each MCP3208), it'll still take less than a millisecond, total, to read all their states. Therefore, it does not matter here whether you use DMA or not. DMA will let you do other stuff while waiting for the conversions and SPI transfers to complete, though.

With Teensy 4.x, I really don't see why one would worry about the latency between sampling the ADCs and being ready to send a MIDI message, as that latency is on the order of 25µs per analog channel using MCP3208 at 1 MHz SPI clock. It is much more worthwhile to make sure one does not send unnecessary MIDI messages, because it very much seems to me that it is the MIDI data channel that is the bottleneck here, not the ADC at all.
Another possibility for those knobs would be Hall-based magnetic encoders, like the AMS AS5600: a 12-bit sensor (4096 positions per revolution), readable via I²C (fixed slave address 0x36) at a maximum clock of 1 MHz. Each encoder can be sampled once per 0.286 ms, 0.55 ms, 1.1 ms, or 2.2 ms, depending on the amount of noise one accepts. It is a simple chip to interface to, needing only a 1 µF ceramic capacitor and pullup resistors for I²C, and you can find breakout boards on eBay (at around 20 € for five). The downside is that the magnet placement on top of the chip is critical; we're talking about within 0.2 mm. However, if you happen to have a 3D printer, you can print knobs with holders for the magnets and a bottom flange, plus a separate holder plate that the knob pokes through, with standoffs to precisely locate the breakout board. This should be precise enough, although each breakout board might need either standoff location adjustment or re-soldering of the AS5600 chip. Add a small spring and a plastic/copper ball bearing (1-3 mm diameter), and you can make any number of detents you want, too.

(Because all AS5600 chips have the same I²C slave address, you'll also need I²C multiplexers, like the TCA9548A, to connect more than one per I²C interface.)
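
On the software side, an absolute 12-bit angle can be turned into an endless or multi-turn control by accumulating signed steps between samples. The sketch below shows only that arithmetic and leaves the I²C traffic out. `as5600Angle` and `angleDelta` are made-up names, and the assumptions are that the angle arrives as a high byte carrying the top four bits plus a low byte, and that the knob moves less than half a turn between two samples.

```cpp
#include <cstdint>

constexpr int32_t kCountsPerTurn = 4096;   // 12-bit sensor

// Combine the two bytes of an angle reading into 0..4095.
inline uint16_t as5600Angle(uint8_t high, uint8_t low) {
    return static_cast<uint16_t>(((high & 0x0F) << 8) | low);
}

// Signed shortest step from `previous` to `current`, in counts.
// Handles the wrap from 4095 back to 0 and the reverse.
inline int32_t angleDelta(uint16_t previous, uint16_t current) {
    int32_t d = static_cast<int32_t>(current) - static_cast<int32_t>(previous);
    if (d >  kCountsPerTurn / 2) d -= kCountsPerTurn;
    if (d < -kCountsPerTurn / 2) d += kCountsPerTurn;
    return d;
}
```

Adding each delta to a running position, and clamping or scaling that position as needed, gives a knob whose range is defined purely in software.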

I haven't done this yet myself, but I have looked into it, as both a human input device, as well as motion encoder (stepper motor rotary encoder) for 3D printing. It's definitely more work, but possibly also more rewarding, as you would be fully in control of the knobs. I definitely wouldn't try this without having a 3D printer, though.
Aren't Bela's claims about its audio latency? I didn't see anything about low MIDI latency on their pages.

Unresponsive continuous controllers in DIY projects are often caused by excessive signal filtering or naive handling of the MIDI messages in the code. Delays need to reach tens of milliseconds before you're likely to notice them on something like filter parameters.

Teensy is fast enough to handle the data with negligible overhead without resorting to advanced programming, as its main loop is typically running orders of magnitude faster than any reasonable stream of MIDI data. Making sure the stream is reasonable is the key.

Are you sure you need 12-bit resolution and LSB CC data? If it's smoothness, not accuracy, that you want, it really shouldn't all be done in MIDI with a ton of messages. (If you're getting 'zippering', check whether the receiving end has a 'smoothing' setting for MIDI control.)

If you are going with LSB data you really need to ensure you manage the traffic you're generating as Teensy's speed can easily overload the receiving device.