Audio For Teensy3 - What Features Would You Want?

I don't understand your question... I don't know what can't be done by software and needs hardware.
The synth can play and MIX together 32/64/128 wav files with low latency and send them to the audio output hardware.
There are also some settings that vary over time.
For example, loop a wav file of a piano note, then play it for 10 seconds, decreasing the volume as it plays, following volume envelopes.
The same goes for pitch or filter envelopes.
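To make that concrete, here is a minimal sketch (all names are illustrative, assuming 16-bit mono samples at 44.1 kHz) of looping a clip while a software volume envelope fades it out:

```cpp
#include <stdint.h>

// Minimal sketch: loop a mono 16-bit clip while a linear volume envelope
// fades it out, e.g. a ~10 second fade of a looped piano note.
const int16_t *clip;      // looped sample data
uint32_t clipLen;         // number of samples in the loop
uint32_t pos = 0;         // current playback position
float gain = 1.0f;        // current envelope level
float gainStep = 1.0f / (44100.0f * 10.0f);  // reaches zero after ~10 s

void fillBlock(int16_t *out, uint32_t n) {
  for (uint32_t i = 0; i < n; i++) {
    out[i] = (int16_t)(clip[pos] * gain);   // apply the envelope
    if (++pos >= clipLen) pos = 0;          // wrap: keep looping the clip
    gain -= gainStep;                       // linear fade
    if (gain < 0.0f) gain = 0.0f;
  }
}
```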
 
Volume/pitch/pan envelopes are more of a software feature. I see HW doing the heavy lifting and software controlling it for various effects. This keeps the HW simpler and more general purpose. So I think it's better to extract HW features from the format and implement support for various formats in software. Thinking about it, it might be good to look into the XAudio or OpenAL interface to get an idea of what kind of features you should have in HW
 
The hardware needs fast access to the memory where the wav/soundfont files are.
Maybe add external memory: 32/64/128/256 MB ...
 
It might also be good to add support for negative volumes to enable implementation of Dolby Surround. Doing an FFT on the master buffer would also be nice for visualization, etc.
 
I've been looking at the SoundFont 2.04 spec. It's quite complex. There is still much I need to digest, but it seems the core function is playing sampled clips at variable rates, where the latter part of the clip can loop. A delay-attack-hold-decay-sustain-release envelope, resonant lowpass filter, tremolo, vibrato, chorus and reverb effects can be applied and the parameters for all those effects can vary. There also seems to be some sort of layering capability, involving multiple clips playing and mixing together, but I do not yet understand that part of the SoundFont format.
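For anyone else reading along, the envelope part boils down to a six-stage state machine per voice. A rough sketch (my own naming; the spec actually expresses times in timecents and attenuation in centibels, simplified here to linear levels and per-sample rates):

```cpp
#include <stdint.h>

// Six-stage delay-attack-hold-decay-sustain-release envelope, simplified.
enum class EnvStage { Delay, Attack, Hold, Decay, Sustain, Release, Done };

struct Envelope {
  EnvStage stage = EnvStage::Delay;
  float level = 0.0f;                         // current gain, 0..1
  float attackRate, decayRate, releaseRate, sustainLevel;
  uint32_t delaySamples, holdSamples, counter = 0;

  float next() {                              // call once per sample or control tick
    switch (stage) {
      case EnvStage::Delay:
        if (++counter >= delaySamples) { stage = EnvStage::Attack; counter = 0; }
        break;
      case EnvStage::Attack:
        level += attackRate;
        if (level >= 1.0f) { level = 1.0f; stage = EnvStage::Hold; }
        break;
      case EnvStage::Hold:
        if (++counter >= holdSamples) { stage = EnvStage::Decay; counter = 0; }
        break;
      case EnvStage::Decay:
        level -= decayRate;
        if (level <= sustainLevel) { level = sustainLevel; stage = EnvStage::Sustain; }
        break;
      case EnvStage::Sustain:
        break;                                // hold until note-off
      case EnvStage::Release:
        level -= releaseRate;
        if (level <= 0.0f) { level = 0.0f; stage = EnvStage::Done; }
        break;
      case EnvStage::Done:
        break;
    }
    return level;
  }
  void noteOff() { stage = EnvStage::Release; }
};
```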

I found numerous dead links while searching. Here's a copy of the Soundfont spec I found after much searching.....
View attachment sounfont_specifications_v2.04.pdf
 
Yeah, I'm totally for supporting existing formats, but just wondering what kind of features it has that require HW support (essentially special support from the mixing routine)

Obviously it's possible to do entirely in software, because FluidSynth can, at least using a powerful processor. How well a Teensy3 with only a 96 MHz Cortex-M can do it remains to be seen.....
 
... good to add support for negative volumes to enable implementation of Dolby Surround.

Can you post a link to any technical documentation about these algorithms?

Please keep in mind this needs to be public domain, create commons or otherwise free to use. PJRC certainly does not have the ability to license proprietary Dolby technology, and even if we could, my intention is to publish the entire audio library as open source.
 
Here's what the audio shield will probably look like....

top.png bottom.png

http://oshpark.com/shared_projects/2k03eMcK
 
I don't have documentation, but what you do is play the sample from the left channel negated on the right channel, which makes it appear in the surround channel. So you need the ability to do a 180 degree phase shift, i.e. multiply the sample by a negative volume.
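In code that's just a negated gain on one channel. A hypothetical mixer snippet to illustrate (my own names, 8-bit-style 0..256 volume):

```cpp
#include <stdint.h>

// To place a source in the matrix-surround channel, mix it into the left
// output at +volume and into the right output at -volume (180 degrees
// out of phase).
void mixSurround(const int16_t *src, int32_t *left, int32_t *right,
                 uint32_t n, int16_t volume) {
  for (uint32_t i = 0; i < n; i++) {
    left[i]  += (src[i] *  volume) >> 8;   // normal volume on the left
    right[i] += (src[i] * -volume) >> 8;   // negated volume on the right
  }
}
```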
 
Obviously it's possible to do entirely in software, because FluidSynth can, at least using a powerful processor. How well a Teensy3 with only a 96 MHz Cortex-M can do it remains to be seen.....
You need some support in your device if you are going to support wavetables, though. If you plan to provide only master buffer access and have essentially a simple DAC with a buffer, then you have to do everything in SW on the host controller. Supporting wavetables makes the device more complex but obviously more efficient for the host. Supporting envelopes and various effects only requires updating the audio channels at a pretty low frequency (e.g. 50 Hz, like in mods) and doesn't put much stress on the host controller, so you may want to consider keeping that part in SW instead, to simplify the audio device and keep it more generic.
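Roughly this split, as a sketch (all names hypothetical): the mixer only reads per-channel registers at audio rate, while the host rewrites them at a ~50 Hz control rate.

```cpp
#include <stdint.h>

// Hypothetical placeholders for software envelope / pitch generators.
int16_t  nextEnvelopeValue(int ch) { return 256; }
uint32_t nextPitchValue(int ch)    { return 1u << 16; }

struct Channel {
  const int16_t *sample;   // wavetable data the mixer reads at audio rate
  uint32_t phase, step;    // fixed-point playback position and rate (pitch)
  int16_t volume;          // current gain, written by the host
};

// Runs on the host at a low control rate (~50 Hz, like a mod player tick):
// the per-sample mixing never sees the envelope logic at all.
void controlTick(Channel *ch, int numChannels) {
  for (int i = 0; i < numChannels; i++) {
    ch[i].volume = nextEnvelopeValue(i);   // software volume envelope
    ch[i].step   = nextPitchValue(i);      // software vibrato / pitch slide
  }
}
```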
 
The optional "MEM" chip on bottom side is meant to be a W25Q64 or W25Q128 flash memory, for storing short sound clips or (maybe) wavetables.
Is this memory able to provide ~5.4 MB/s random access (4 bytes / read)? That's what's required for 32-channel 16-bit stereo mixing at 44.1 kHz. From the datasheet I saw it can do 50 MB/s continuous, but I didn't see the random access performance.

Edit: Actually double that if you do linear interpolation, so 10.8 MB/s (8 bytes / read)
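For reference, the arithmetic behind those figures: 32 voices × 4 bytes per fetch × 44,100 samples/s = 5,644,800 bytes/s ≈ 5.4 MiB/s, and fetching two source samples per output sample for linear interpolation doubles that to ≈ 10.8 MiB/s.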
 
The W25Q128 is capable of pretty high speeds, but the SPI port on Teensy3 is limited to 24 Mbit/sec, so 3 Mbyte/sec is the absolute maximum possible speed. Each random access incurs an overhead of 4 bytes, but then may read any number of sequential bytes at the full speed.
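To put rough numbers on that: at 24 Mbit/s the bus moves 3 Mbyte/s of raw data, but a random 4-byte read actually clocks 8 bytes on the wire (4 of command/address overhead), so purely random 4-byte accesses top out around 1.5 Mbyte/s, while reading, say, 128 sequential bytes per access cuts the overhead to about 3% and gets close to the full 3 Mbyte/s.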

I have some ideas for software and buffering and conversion to an intermediate format that (hopefully) may help make the most of the hardware. But soundfonts that make heavy use of the extra/optional features like the parameterized resonant filter, vibrato, and reverb are going to chew up CPU time. Even with the SIMD optimizations (which certainly aren't going to be used in early releases), there's only so much of that CPU intensive processing a microcontroller can do.
 
Ok, but I thought you wanted to do the mixing on the audio microcontroller, not on the Teensy, or did I miss something? That's the point of wavetables after all: not to stress the host microcontroller.
 
Hmm, isn't this a ~200 MHz 32-bit chip?
96 MHz.

I'm currently mixing 12 8-bit mono channels at 37 kHz on a 16 MHz 8-bit Arduino, so surely you could do much, much better than that!
Assuming for the moment that performance scales linearly with clock speed and independently of architecture (which it doesn't), you have 12 channels of 8-bit mono on a 16 MHz chip and are asking for 32 channels of 16-bit stereo, which is 16 MHz × (32/12) × 2 × 2 ≈ 170 MHz (×2 for 16-bit vs 8-bit, ×2 for stereo vs mono). Teensy 3.0 is a 48 MHz chip clocked to 96 MHz.

If you really need 32-channel stereo polyphony with significant processing on each channel, then a platform like the BeagleBone (720 MHz ARM Cortex-A8, 256 MB DDR2) or BeagleBone Black (1 GHz ARM Cortex-A8, 512 MB DDR3, floating-point accelerator) would seem more appropriate.
 
Ok, but I thought you wanted to do the mixing on the audio microcontroller, not on the Teensy, or did I miss something? That's the point of wavetables after all: not to stress the host microcontroller.
What audio microcontroller? The Wolfson Audio Codec WM8731 is an I2S DAC and ADC, not a microcontroller.
 
Well duh, I thought the plan was to have a microcontroller on the audio shield doing all the mixing!
 
The shield has only the codec chip. The codec is likely to be SGTL5000. All the processing and mixing happens in the MK20 microcontroller chip on Teensy3.

I've been working with the WM8731 chip and the SGTL5000, and of course starting a library, over the last few weeks. The ARM Cortex-M is indeed much faster than the 8-bit AVR on a normal Arduino, but there are limits. I'm pretty sure the MK20 could easily read 32 arrays from flash, sum them together and feed that to the I2S output. But 32 soundfonts making heavy use of the CPU-intensive extra features is another matter. At this early stage (no actual code written to do all that stuff), I don't think anyone could really estimate accurately how many simultaneous voices can be played.

The hardware and audio library aren't ready for release yet, but they will be within a matter of weeks. The Teensy3 of course is available now, as is Hyple's I2S library, so if anyone wants to fiddle with code to implement wavetables and investigate what is (or might be) possible in more detail, they certainly can.

I'd also be curious to know how FluidSynth performs on a Beaglebone or Raspberry Pi, if anyone can find that info or give it a try?
 
I'd also be curious to know how FluidSynth performs on a Beaglebone or Raspberry Pi, if anyone can find that info or give it a try?

not sure about fluidsynth, but there's plenty of people using pd or supercollider with these boards of course. i haven't seen any performance comparisons or analyses, would be curious too. ccrma satellite doesn't even include fluidsynth by default, it seems; only supercollider, pd, faust and chuck, which i guess makes a lot of sense when you think about the user community presumably targeted by ccrma (well, and those being more interesting software from a diy perspective). in my own experience with rpi/pd, one can easily run a lot of osc~ objects (20+) but i haven't tried to max things out. a guy from the local art school, who uses a BBB for supercollider stuff, tells me BBB performance is better.

i'm not sure though whether those linux boards are the best reference points. in the same ballpark, preenFM comes to mind for instance (maple mini), which i think can handle 6 or 8 notes polyphony. but there's as much to be said about focusing on one voice that sounds good, rather than many voices that sound bad. also of course, one voice can get CPU intensive enough in its own terms. the less toy-like things i have seen tend to be monophonic. the mutable instruments "braids" for instance contains an stm32f4, i believe, and sounds pretty decent (i haven't seen the code but apparently it can do all sorts of things: bandlimited wavetables, FM synthesis, some physical modelling stuff, etc). there's similar things which are dsPIC33 based, all sounding nice.

looks as if that W25Q128 is the same footprint as the 23LC1024, so i figure it will be possible to try these out, too. nice.
 
looks as if that W25Q128 is the same footprint as the 23LC1024, so i figure it will be possible to try these out, too. nice.

Yes. In fact, I made a special SOIC-8 footprint that (hopefully) can accept the wider W25Q128 or the narrow 23LC1024.

in the same ballpark, preenFM comes to mind for instance (maple mini), which i think can handle 6 or 8 notes polyphony. but there's as much to be said about focusing on one voice that sounds good, rather than many voices that sound bad.

My thoughts exactly. Well, except I'm really shooting for at least 4 note poly, and of course more if I can. :)

Then again, FM (or at least phase modulated) is extremely easy. I already wrote a modulated oscillator, but haven't tested it much yet. I've been focusing mostly on the I/O objects, library API, and hardware for a first release. In fact, a first release will likely only support a W25Q128 chip for audio clip playing, so I can test the hardware before PJRC sells the boards. More advanced software support, like soundfont-based wavetables, can come later.
 
Yes. In fact, I made a special SOIC-8 footprint that (hopefully) can accept the wider W25Q128 or the narrow 23LC1024.

i see, will be curious to try it out.

Well, except I'm really shooting for at least 4 note poly, and of course more if I can. :)

Then again, FM (or at least phase modulated) is extremely easy. I already wrote a modulated oscillator, but haven't tested it much yet. I've been focusing mostly on the I/O objects, library API, and hardware for a first release. In fact, a first release will likely only support a W25Q128 chip for audio clip playing, so I can test the hardware before PJRC sells the boards. More advanced software support, like soundfont-based wavetables, can come later.

for basic NCO based stuff, that should be perfectly possible, i would have thought. i don't have the tools to run any systematic performance measures, but using hpyle's DMA i2s library i have experienced no problem whatsoever (as a coding non-expert == inefficient code) running 4 slightly detuned instances of an oscillator (which is basically 4 calls to arm_linear_interp_q15 per sample, plus mixing it all together; i'm not using envelopes, those come from somewhere else). i haven't tried to push it though, and i still need to implement bandlimited wavetables, which will roughly double the processing going on.
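for comparison, that kind of NCO looks roughly like this (my own sketch, not the code above, with plain fixed-point interpolation standing in for arm_linear_interp_q15):

```cpp
#include <stdint.h>

#define TABLE_BITS 10                        // 1024-entry wavetable
#define FRAC_BITS  (32 - TABLE_BITS)

int16_t wavetable[1 << TABLE_BITS];          // one cycle, filled elsewhere

struct Osc { uint32_t phase = 0, step = 0; };  // step sets the pitch

static inline int16_t oscNext(Osc &o) {
  uint32_t idx  = o.phase >> FRAC_BITS;                      // table index
  uint32_t frac = (o.phase >> (FRAC_BITS - 15)) & 0x7FFF;    // 15-bit fraction
  int16_t a = wavetable[idx];
  int16_t b = wavetable[(idx + 1) & ((1 << TABLE_BITS) - 1)];
  o.phase += o.step;
  return a + (int16_t)(((int32_t)(b - a) * (int32_t)frac) >> 15);  // lerp
}

// Four slightly detuned instances mixed into one output sample.
Osc voices[4];

int16_t nextSample() {
  int32_t sum = 0;
  for (int i = 0; i < 4; i++) sum += oscNext(voices[i]);
  return (int16_t)(sum >> 2);                // divide by 4 to avoid clipping
}
```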
 
I'm planning to port my mod player to Teensy 3.0 so that will give you some reference performance numbers at least. I optimized the mixing and interrupt routines in AVR asm, but there are also equivalent C++ versions at roughly half the performance, so it should be fairly straightforward to do a first port to Teensy and then work on a Cortex asm optimized version. I just need to figure out how to set up a timer interrupt to run at a given frequency on Teensy. I had a quick look at the Cortex-M4 instruction set, and since it's 32-bit and runs 3x faster than my Uno, I would expect it to be able to handle 32 channels at 44 kHz while running at 48 MHz, even with stereo panning and linear interpolation. I'm not familiar with the Cortex SIMD instruction set either, but if it's anything like SSE on the PC, that should give a nice additional boost to increase the number of channels even further. I'm currently adding support for volume envelopes, but after that I could move on to the Teensy port.
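For context, the inner mixing loop I'm describing has roughly this shape (a sketch with my own names, not the actual mod player code): per channel, fetch with linear interpolation, apply left/right pan gains, and accumulate into 32-bit stereo accumulators.

```cpp
#include <stdint.h>

struct Voice {
  const int16_t *data;     // sample data
  uint32_t pos, step;      // 16.16 fixed-point position and playback rate
  int16_t volL, volR;      // pan gains, 0..256
};

void mixBlock(Voice *v, int numVoices, int16_t *outL, int16_t *outR, int n) {
  for (int s = 0; s < n; s++) {
    int32_t accL = 0, accR = 0;
    for (int c = 0; c < numVoices; c++) {
      uint32_t i   = v[c].pos >> 16;
      int32_t frac = v[c].pos & 0xFFFF;
      int32_t a = v[c].data[i];
      int32_t b = v[c].data[i + 1];
      int32_t smp = a + (((b - a) * frac) >> 16);   // linear interpolation
      accL += (smp * v[c].volL) >> 8;
      accR += (smp * v[c].volR) >> 8;
      v[c].pos += v[c].step;
    }
    // crude clipping; a real mixer also handles looping and sample end
    if (accL >  32767) accL =  32767;
    if (accL < -32768) accL = -32768;
    if (accR >  32767) accR =  32767;
    if (accR < -32768) accR = -32768;
    outL[s] = (int16_t)accL;
    outR[s] = (int16_t)accR;
  }
}
```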
 
I'm planning to port my mod player to Teensy 3.0 so that will give you some reference performance numbers at least.

Great. I'm really curious to hear it. :)


I just need to figure out how to set up a timer interrupt to run at a given frequency on Teensy.

Use IntervalTimer. Just create an IntervalTimer object, then call begin(function, period_microseconds) to schedule your code to run. You can use a float for the microseconds. If it's a constant, it's converted at compile time to the integer actually used by the hardware.

The upcoming audio library will process data in 128 sample blocks, which is approx 2.9 ms at 44.1 kHz. If you start coding now, I'd highly recommend generating 128 sample blocks and plan on 44.1 kHz sample rate, which should make using the library relatively easy.
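Putting those two suggestions together, a minimal sketch (assuming the IntervalTimer usage described above and a 128-sample block at 44.1 kHz):

```cpp
#include <Arduino.h>   // IntervalTimer is part of the Teensy 3.0 core

IntervalTimer blockTimer;
int16_t block[128];

void fillBlock() {
  // generate or mix the next 128 samples here
  for (int i = 0; i < 128; i++) block[i] = 0;
}

void setup() {
  // 128 / 44100 s = ~2902.5 microseconds per block; a constant float
  // period is converted to the hardware's integer units at compile time.
  blockTimer.begin(fillBlock, 128.0f * 1000000.0f / 44100.0f);
}

void loop() {
}
```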
 
Thanks, I'll have a look at IntervalTimer.

Both the buffer size and the mixing frequency are something I can easily configure. The buffer is actually double buffered, so that while the interrupt is playing one half of the buffer, the mixer routine is filling the other half. If you are interested, I'm planning to release the code under the BSD-3 license.
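For anyone curious, the double-buffer handshake looks roughly like this (a sketch with my own variable names, not the actual player code):

```cpp
#include <stdint.h>

#define HALF 128

int16_t buffer[2][HALF];
volatile uint8_t playHalf = 0;         // half the interrupt is reading from
volatile uint16_t playPos = 0;
volatile bool refillRequest = false;   // set when a half has been consumed
volatile uint8_t refillHalf = 0;       // which half needs new data

// Called at the sample rate from the timer interrupt.
int16_t nextOutputSample() {
  int16_t s = buffer[playHalf][playPos++];
  if (playPos >= HALF) {
    playPos = 0;
    refillHalf = playHalf;             // this half is now free to refill
    refillRequest = true;
    playHalf ^= 1;                     // keep playing from the other half
  }
  return s;
}

// Called repeatedly from the main loop.
void mixerTask() {
  if (refillRequest) {
    refillRequest = false;
    // ... mix the next HALF samples into buffer[refillHalf] here ...
  }
}
```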
 