Audio For Teensy3 - What Features Would You Want?

48 kHz is looking unlikely, at least using I2S that keeps the same rate synchronous with the processor's clock, because Freescale's circuitry just can't generate a 48 kHz MCLK without massive jitter. Perhaps someone will develop an async I2S object?


Paul, have you considered the idea of using an external clock source for MCLK? The Cirrus Logic CS2000 looks like a very good solution, and it requires a maximum of 7 external components. It is also very small, in a 10-MSOP package.
 
I'm currently working on a guitar sound recording/processing project (in the form of a stompbox), so my priorities for an audio module would be:

(please excuse newbie parlance)

- input & output suited for electric guitar (solder pads; suitable gain level for ADC; controllable gain perhaps)
- mono, CD quality
- easy access to storage; recording & playback sampled sound to uSD seems perfect

...actually if I had these 3 features, my project would be ready :)
 
I'm already working on that; in fact, I'm finishing a design based on the WM8740 and the WM8786, with a pair of PGA4311s for volume control, and an input selector built from ADG333A analog switches that routes the signal to an LME49724, which converts the unbalanced guitar signal to a balanced one to correctly drive the ADC. There's also a DC-DC converter that turns the USB 5 V into ±5 V to power all of the analog subcircuits. I'm also working on the USB library side, but as Paul said, that stuff is really complex; so far I've managed to make the Teensy enumerate as a USB-compliant microphone, and it streams some data, although not as it should. There's a topic I opened for that, if anyone needs more in-depth information.


@Paul: unfortunately, I do not think I have enough time and knowledge right now to handle such a project, especially if you consider it complex. I know it can be disappointing to see people contribute requests but not code, but that's the sad truth: not enough time...

@MickMad: I checked the thread; thanks for working on this, and good luck. I'm sure this will eventually spark _much_ interest in the digital audio community. I do not know if this can be of any help, but the LUFA driver includes _some_ code related to USB audio, e.g. https://github.com/abcminiuser/lufa/tree/master/Demos/Device/ClassDriver/AudioInput (MIT license for non-commercial use)
 
48 kHz is looking unlikely, at least using I2S that keeps the same rate synchronous with the processor's clock, because Freescale's circuitry just can't generate a 48 kHz MCLK without massive jitter. Perhaps someone will develop an async I2S object? Hyple's library has the necessary I2S code. The difficult part is "just" how to integrate it with this not-yet-published audio API stuff. Later this year or early next year, after at least a few alpha or beta releases, would be the time to consider that.
Fair enough. (Sorry for the delay in responding, I was flying back from the US to France)

On anti-alias filtering, most I2S DACs add interpolated samples, so all the aliasing is at much higher frequencies. One I have my eye on for an "audiophile interface" board is the Wolfson WM8740, which adds 8X interpolation.
Good point about the aliasing being shifted up by interpolation, I had forgotten about that.
I have seen (but not played with) an audiophile board using the WM8741, the Opus.

The API will be designed for 16 bit samples. I'm sure the desire for 24 or more bits is going to become a frequently asked question. ...

I'm sure we're going to see people regularly asking for 24-bit audio, thinking it would somehow sound better. Maybe it would? If enough people really, really want this, I'll probably work on it eventually. In the meantime, I'm probably going to adapt the I2S code to use 32-bit words (using only the first 16 bits), partly so at least the capability will be there for anyone who wants to take on 24 bits, but mostly because nearly all DACs show this in the timing diagrams.
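A minimal sketch of that packing, assuming left-justified (MSB-first) I2S framing:

Code:
#include <stdint.h>

// Place the 16-bit sample in the 16 MSBs of the 32-bit I2S word.
// The low 16 bits stay zero unless 24-bit support is ever added.
static inline int32_t pack_i2s_word(int16_t sample) {
    return ((int32_t)sample) << 16;
}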

The only realistic use case for 24 bits is recording, where 16 bits is problematic as an input format for headroom and noise reasons, and the next convenient step up from that is 24. Much of the hardware is of course only delivering 18 or 20 ENOB (or 16 if you get some really bad 24-bit hardware). I record at 48/24 or 96/24; I see the benefit there.

Once all the equalisation and gain staging is done, the mixed-down (and mastered, if you believe in that sort of thing) result is fine at 16 bits.

About the "sounding better" argument: I have an audio interface (Presonus AudioBox 44VSL) which handles 4 channels of input and output at 96/24, a good headphone amp (Lake People G109, using balanced connection to the AudioBox) and fairly accurate headphones often used for mixing (Audio Technica ATH M50S). I also have a selection of music (FLAC, lossless) at higher bitrates and sample depths, mostly from Linn Records, up to 96/24. With that equipment I am unable to hear any difference at all between 96/24 and 96/16 or indeed 48/24 or 48/16. Maybe someone with better equipment or better ears can reliably tell them apart.

I already have a DDS sine wave object. Will soon work on a version where each phase accumulator increment is based on the samples from another audio stream, instead of a constant.
That would be good. I know it's conventional for a DDS sine wave to store a table for only one quadrant and reuse it for the other three; for non-sine uses, however, it's useful to have the option of a full-cycle table.
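A minimal sketch of such a modulated DDS oscillator, assuming a full-cycle 256-entry table and a 32-bit phase accumulator (the names and the modulation scaling are illustrative, not from the actual library):

Code:
#include <stdint.h>

#define TABLE_SIZE 256                    // full-cycle wavetable, any waveform
extern const int16_t wavetable[TABLE_SIZE];

static uint32_t phase = 0;                // 32-bit phase accumulator

// One output sample; the increment is offset by a sample from another
// audio stream, giving frequency modulation. The shift sets the
// modulation depth and is an arbitrary choice here.
int16_t dds_next(uint32_t base_increment, int16_t modulator) {
    phase += base_increment + (((int32_t)modulator) << 8);
    return wavetable[phase >> 24];        // top 8 bits index the table
}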

The ability to use the 16 bit analog input for pots or control voltages sounds pretty nice, but I'm not quite sure how it could work in the context of these Codec chips. They're really designed for only AC coupled signals.
Also a fair point. CV sampling needs DC coupling, obviously. But CV can also change at up to audio rates (say 8 kHz or so), so the sampling may need to run at audio-like rates.
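A minimal sketch of that kind of CV sampling using the Teensy's own DC-coupled ADC, sidestepping the codec entirely (the pin, rate, and resolution are just examples):

Code:
#include <Arduino.h>

IntervalTimer cvTimer;
volatile uint16_t cvValue = 0;

void sampleCV() {
    cvValue = analogRead(A0);        // DC-coupled control voltage input
}

void setup() {
    analogReadResolution(13);        // Teensy 3.x on-chip ADC
    cvTimer.begin(sampleCV, 125);    // 125 us period = 8 kHz sample rate
}

void loop() {
    // use cvValue to update synthesis parameters
}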
 
I'm currently working on a guitar sound recording/processing project (in the form of a stompbox), so my priorities for an audio module would be:

- input & output suited for electric guitar (solder pads; suitable gain level for ADC; controllable gain perhaps)
- mono, CD quality
- easy access to storage; recording & playback sampled sound to uSD seems perfect

For the input part, you need a good op-amp and a few external components: resistors to set the gain, and capacitors to bypass power supply noise. If you want it to work really well, the op-amp also needs a dual supply. Considering that this kind of circuitry isn't needed by everyone, and that it would obviously eat some space on the board, it seems unlikely to end up on the board itself; Paul would then have to add an option to bypass that circuitry, which would need even more space. So I don't think this will ever make its way into the final product.

BUT, it is very easy to accomplish this with a little external board that amplifies the guitar level (~300 mV RMS) up to the line level of the ADC (usually 1 or 2 V RMS in low-power ADCs), using the components above; you could then solder its output straight to the line-level input pads of the audio board.

For the output part, I assume you'd like to plug this stompbox directly into a guitar amplifier. Remember that a guitar amplifier expects guitar level, so you need the opposite of the input stage: attenuate the board's line output (again, 1 or 2 V RMS) down to guitar level. Again, this kind of stuff isn't needed in a design pursuing the "one size fits all" rule, so it's unlikely to end up on the final board; the easiest and fastest thing is to DIY it and add it yourself.
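To put rough numbers on both stages (assuming ~300 mV RMS guitar level and ~1.5 V RMS ADC full scale; adjust for the actual parts):

Code:
Input:  gain = 1.5 V / 0.3 V = 5   (about +14 dB)
        e.g. a non-inverting op-amp stage with Rf/Rg = 4
Output: attenuation = 0.3 V / 1.5 V = 1/5   (about -14 dB)
        e.g. a resistor divider, 40k over 10k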
 
External MCLK was discussed earlier. TL;DR = The software needs the same rate sync'd to the CPU clock. At least the first version will not support async I2S relative to the CPU clock.

I haven't tried recording to SD yet. Bill did some optimization work on SdFat several months ago to improve write speeds, at least with better quality cards. I'm optimistic recording will be possible, but I'm less hopeful that you'll be able to record continuously while simultaneously playing another file from elsewhere on the same card.
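For anyone wanting to experiment before the library exists, the shape of SD recording with SdFat might look like this (an untested sketch; error handling, double buffering, and the audio ISR are omitted):

Code:
#include <SdFat.h>

SdFat sd;
SdFile rec;
volatile bool bufReady = false;        // set by an audio ISR (not shown)
int16_t buf[512];                      // 512 samples = 1 KB per write

void setup() {
    sd.begin(10);                      // chip select pin is board-specific
    rec.open("record.raw", O_WRITE | O_CREAT | O_TRUNC);
}

void loop() {
    if (bufReady) {                    // the ISR filled a block
        bufReady = false;
        rec.write((const uint8_t *)buf, sizeof(buf));
    }
}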

Eventually I'll work on USB audio. The main difficulty is that everything in usb_dev.c is based on 64-byte USB packet buffers. To usefully support isochronous transfers, a second set of buffer management will be needed for the isochronous endpoints that stream faster than 64 kbytes/sec.

USB audio support will also require syncing the USB clocked data to the CPU clock. That's the same problem (or very similar) as the external MCLK issue.

Years ago I looked at LUFA's audio examples. In fact, I reported a bug to Dean which resulted in dramatically improved audio output: he still decimates without proper filtering, but 48 -> 32 causes far less aliasing, or as Dean put it, "ultrasonic whine", than 48 -> 16.

Of course, LUFA uses polled register I/O because it targets 8-bit AVR. On Teensy3, the USB works using DMA and interrupts, without any hardware support for the simple but inefficient I/O method used on AVR, so other than the descriptors (which are published in lots of places), there's very little in LUFA that applies to Teensy3. The specific issue of managing larger DMA buffers doesn't even exist in LUFA, since the AVR hardware only provides access to the USB buffers as a 1-byte-at-a-time FIFO implemented completely in hardware.

Dean and I also have almost completely opposite coding styles. Dean prefers extensive type definitions and macros in headers that leverage compiler syntax, which is nice in that it results in very verbose and descriptive code, but it also creates a massive number of lines of code and a tremendous amount of "stuff" to learn just to use it. The descriptions are based on whatever's in those headers, so volumes of extensive documentation are needed; in theory everything is so well abstracted that you only need LUFA's documentation. He organizes things using many files in a deep directory hierarchy.

I prefer a minimalistic style, using mostly primitive types (uint8_t, uint32_t, etc.) with structs/typedefs used only where really compelling (e.g. my USB descriptors are just arrays of bytes), and with relatively few macros and special names defined, even if that means putting the direct hardware register names right into the code. My goal is usually to put all the code for something into just 2 files: one C or CPP file for the code and one header for definitions other code needs to use it. My idea of "readable" code is short enough to fit in 2 side-by-side windows, so I can read most of it without much scrolling. Dean's idea of readable is much more verbose.

I just took a quick look... Dean's latest LUFA has a total of 568 header files (1322 files total) nested in 6 levels of subdirectory hierarchy, with 243 headers in the library proper, and of course all the code in about twice that many code files. Admittedly, LUFA supports much more hardware and many more options, but still, it's a very verbose style that I find pretty difficult to digest. Even just reading Dean's USB descriptors requires finding numerous headers to see what all the definitions really do.
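As a concrete example of that minimalist descriptor style, here is a generic 7-byte endpoint descriptor per the USB 2.0 spec (not Paul's actual code):

Code:
// A standard USB endpoint descriptor as a plain byte array:
static const uint8_t endpoint_descriptor[7] = {
    7,            // bLength
    5,            // bDescriptorType = ENDPOINT
    0x81,         // bEndpointAddress: endpoint 1, IN
    0x02,         // bmAttributes: bulk
    64, 0,        // wMaxPacketSize = 64 (little endian)
    0             // bInterval
};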
 
The main difficulty is that everything in usb_dev.c is based on 64-byte USB packet buffers. To usefully support isochronous transfers, a second set of buffer management will be needed for the isochronous endpoints that stream faster than 64 kbytes/sec.

USB audio support will also require syncing the USB clocked data to the CPU clock. That's the same problem (or very similar) as the external MCLK issue.

Oh, so that's why I can't get the right data written to the isochronous endpoint needed to stream audio over USB... well, maybe I'll try to modify usb_dev.c to support different kinds of buffers; those 64-byte buffers are also very harsh to work with when dealing with 16-24 bit audio :D If you have any suggestions about these kinds of hacks, tell me in the USB audio topic; this stuff can get kind of off-topic for this one.
 
I'll just chime in to say that, having never used LUFA, that coding style sounds like a nightmare to parse and get acclimated to, whereas all the Teensy-related 'low level' stuff (i.e., USB descriptors, etc.) is pretty easily navigated and quickly understood.

TL;DR: I like your coding chops, Paul.
 
I would just like something where I can efficiently push audio data (8/16 bits, mono/stereo; this should be something you can choose based on your application) to a double buffer on the audio device, which raises an interrupt when the end of a buffer is reached during playback, so I can hook the interrupt to fill in more data. Basically what the SB16 did back in the day. I would keep all the other crap (USB, SD card, etc.) off the device and focus on doing one thing well, which is audio playback. If you really want, you could maybe add an equalizer on the device.
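A sketch of the SB16-style model being described (all names and sizes are made up for illustration, not the actual library):

Code:
#include <stdint.h>

#define HALF_SIZE 256
int16_t dma_buffer[2][HALF_SIZE];     // hardware drains one half while
                                      // software refills the other

extern void fill_more_audio(int16_t *dst, int count);  // user callback

// Called from the DMA half-complete / complete interrupt.
void audio_isr(int finished_half) {
    fill_more_audio(dma_buffer[finished_half], HALF_SIZE);
}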
 
I've been thinking recently about the things people will want to actually do with Teensy3 audio.

So far, I've come up with 5 categories.

  • Play audio: pre-recorded music, sound effects. Data is read from media and played without modification.
  • Play musical notes or synthesize sounds, using wavetables, FM, other synth algorithms? Usually at least a pitch/note# and intensity/velocity specify how the note will be played. Most of these applications probably want polyphonic support.
  • Analyze real-time audio: music visualization, beat/rhythm detection, audio user interfaces, others?
  • Modify real-time audio: guitar effects pedals, voice effects
  • Record audio

Have I missed any important applications?

A first software release isn't going to do everything, but I'm trying to look forward to future applications and at least design the APIs in ways that will enable important uses. The last thing I want to do is publish a 1.0 (or 0.1) library and then later have to make incompatible API changes after people are already using it.
 
I've been thinking recently about the things people will want to actually do with Teensy3 audio. So far, I've come up with 5 categories.

  • Play audio: pre-recorded music, sound effects. Data is read from media and played without modification.
  • Play musical notes or synthesize sounds, using wavetables, FM, other synth algorithms? Usually at least a pitch/note# and intensity/velocity specify how the note will be played. Most of these applications probably want polyphonic support.
  • Analyze real-time audio: music visualization, beat/rhythm detection, audio user interfaces, others?
  • Modify real-time audio: guitar effects pedals, voice effects
  • Record audio

makes sense to me. especially items 2, 3, and 4. as to all things synthesis, perhaps it would be nice though to keep any midi- or osc-type control/envelope stuff separate from the actual synthesis algorithms. though you can always disentangle things later of course..
 
Unless you are planning to provide true wavetable support with a bunch of memory to load all the samples into, forget #2. I would also advise against AdLib-style playback support. In fact, all of the playback features you mentioned can be achieved with the simple double buffer with interrupts. Don't fall into the DirectX Retained Mode trap that no one is going to use because it sucks. If you want to provide an interface with higher-level functionality, then provide an OPTIONAL additional library for it, and keep the low-level immediate interface that communicates with the device clean of the higher-level functionality.

I implemented a music player for Arduino, and what's time-intensive is A) mixing the samples and B) pushing the data to the DAC with an interrupt. However, you can't provide A without good wavetable support. Having wavetable support with at least 1 MB of memory would be nice, though it complicates your design quite a bit, I'm sure. Think of mixing ~32 16-bit samples @ 44 kHz with panning support, and the way you expose this in the API. Then you need volume, pitch & pan control for all the channels, and a way to set interrupts triggered at a given position in each channel to deal with looping (forward/backward/ping-pong). And if you want to get more complex, think of adding Ogg Vorbis decoding support (since MP3 is patented).
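To make that workload concrete, one channel's contribution in such a fixed-point mixing loop might look like this (names and formats are illustrative; ~32 of these per block is the load being described):

Code:
#include <stdint.h>

struct Channel {
    const int16_t *samples;   // source PCM data
    uint32_t pos, step;       // 16.16 fixed-point position and pitch
    uint8_t volume;           // 0..255
    uint8_t pan;              // 0 = full left, 255 = full right
};

// Accumulate one channel into 32-bit L/R buses (saturate later).
void mix_channel(struct Channel *ch, int32_t *busL, int32_t *busR, int n) {
    for (int i = 0; i < n; i++) {
        int32_t s = ch->samples[ch->pos >> 16] * ch->volume;
        busL[i] += (s * (255 - ch->pan)) >> 16;
        busR[i] += (s * ch->pan) >> 16;
        ch->pos += ch->step;            // pitch control via the step size
    }
}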

Here's a music player I did for Arduino; I was thinking of porting it to Teensy. The quality isn't great because on Arduino there's only 32 KB of memory for all the audio data (samples, music patterns) & the code, and the DAC I have is a horrible resistor DAC. If there were more memory in the form of a wavetable, mixing were done off-chip, and there were a proper DAC, the quality would be much better.
 
Unless you are planning to provide true wavetable support with a bunch of memory to load all the samples into, forget #2.

huh, why's that? think Max Mathews rather than Wolfgang Palm? anyways, granted, memory will be a limiting factor, as was pointed out by paul above. still, things look better than with AVRs, no? in terms of audio (file) playback, catering to various formats such as ogg vorbis and so on doesn't strike me as particularly interesting, musically speaking. real-time resampling capabilities etc would.
 
huh, why's that?
If there's no proper wavetable support, what do you think you can achieve? Recreating an awful AdLib-like device, huh? I want to free the uC from doing time-consuming audio sample mixing & decoding, so you need to be able to load custom sample data onto the device (potentially MBs of it). OGG decoding is interesting because I don't want to spend uC time on that, and I don't need various formats; OGG is just fine. I WAS talking about resampling AND mixing those samples AND beyond, to have a truly useful audio device.
 
What exactly is meant by "proper wavetable support"?

Realistically, 32 note polyphony probably isn't going to be possible using software on a Cortex-M chip. Four might be do-able, maybe even 6 or 8 as the code gets optimized with the M4's SIMD instructions?

Regarding wavetables, so far I've been looking at the SoundFont file format. Are wavetable synthesis sounds commonly distributed in other formats?

Also, on the latest PCB prototype, I added unpopulated places to solder chips. There's a place to add a 23LC1024, which might be useful for delays up to about 1.5 seconds, and another place to solder an AT45DB161E flash memory (2 megabytes) for wavetable storage. How useful these will really be in practice remains to be seen....
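The 1.5-second figure checks out, assuming 16-bit mono at 44.1 kHz:

Code:
23LC1024: 1 Mbit = 131,072 bytes
131,072 bytes / (44,100 samples/s x 2 bytes/sample) ~ 1.49 seconds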
 
What exactly is meant by "proper wavetable support"?
Off the top of my head, you should have:
1) enough memory to make the wavetable useful (or at least make it something I can expand myself if it's too expensive off the shelf; this is what the GUS had back in the day, where you could buy extra memory for the card)
2) support for different sample formats (8/16-bit PCM is fine; no need for e.g. ADPCM and the like for now, though it's nice of course!)
3) each audio channel should have volume, frequency & pan control for mixing
4) forward/backward/ping-pong looping for each channel (see the sketch below)
Once I have that, I can control each channel's state to play music, which isn't very performance-intensive.
Then you can of course have special stuff, like an equalizer per channel, global volume control, etc., but the above I would say is the minimum feature set. I don't know how you plan to load samples into the wavetable though.
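A sketch of what the looping in item 4 implies for the per-channel position update, assuming a 16.16 fixed-point position (names illustrative):

Code:
#include <stdint.h>

enum LoopMode { LOOP_FORWARD, LOOP_BACKWARD, LOOP_PINGPONG };

struct LoopState {
    uint32_t start, end;       // loop points, 16.16 fixed point
    int32_t  step;             // signed: negative while playing backward
    enum LoopMode mode;
};

void advance(struct LoopState *lp, uint32_t *pos) {
    *pos += lp->step;
    if (lp->step > 0 && *pos >= lp->end) {
        if (lp->mode == LOOP_PINGPONG) lp->step = -lp->step; // bounce back
        else *pos = lp->start;                               // wrap to start
    } else if (lp->step < 0 && *pos <= lp->start) {
        if (lp->mode == LOOP_PINGPONG) lp->step = -lp->step; // bounce forward
        else *pos = lp->end;                                 // wrap to end
    }
}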

Realistically, 32 note polyphony probably isn't going to be possible using software on a Cortex-M chip.
Hmh, isn't this a ~200 MHz 32-bit chip? I'm currently mixing 12 8-bit mono channels at 37 kHz on a 16 MHz 8-bit Arduino, so surely you could do much, much better than that! I don't support panning, though, and only 8-bit samples, so adding panning/16-bit adds some extra cost. Also, I'm not sure how fast reading from external flash is and whether you can supply enough data for mixing. I guess that's where your bottleneck will be.

Regarding wavetables, so far I've been looking at the SoundFont file format. Are wavetable synthesis sounds commonly distributed in other formats?
I'm playing MOD, S3M, IT & XM formats, which contain music patterns (i.e. notes & effects) and samples. The samples are usually 8/16-bit PCM (signed or unsigned); the IT format has its own compression scheme, but I just decompress it to PCM. Some formats also support ADPCM and OGG, but I don't care about those, as I have never seen a mod file using those compression formats. These mod files use 4-32 audio channels. The various effects (e.g. vibrato, arpeggio, portamento, etc.) do not really need anything special from the audio device; they are implemented simply by adjusting the pitch/volume of each channel.

If you are interested, you can get mod files for example from The Mod Archive and use OpenMPT for playback and investigation of samples/patterns.

Edit: Oh, and you should also definitely have linear interpolation in resampling. It adds some extra cost to the mixing routine, though.
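The interpolation itself is only a few operations per sample; a sketch using the fractional part of a 16.16 fixed-point position (reduced to 15 bits so the product stays within 32 bits):

Code:
#include <stdint.h>

// Caller must guarantee pos stays below the last sample.
int16_t interp(const int16_t *samples, uint32_t pos /* 16.16 */) {
    int32_t a = samples[pos >> 16];
    int32_t b = samples[(pos >> 16) + 1];
    int32_t frac = (pos & 0xFFFF) >> 1;            // 15-bit fraction
    return (int16_t)(a + (((b - a) * frac) >> 15));
}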
 
Is there good documentation available somewhere for the .xm or .it file formats? I did some searching and found a lot of history of old MOD software, even Amiga stuff. The closest I found to file format "documentation" was a couple of open-source mod players.
 
If there's no proper wavetable support, what do you think you can achieve? Recreating an awful AdLib-like device, huh?

easy, ... i was only curious as to why you were so dismissive. you wrote "forget #2", so i was under the impression you were talking about wavetable synthesis, which is a slightly ambiguous term, but obviously you were not. in the direction of #2, i'm pretty sure you can do things other than recreating adlib devices.

that said, i'm still not convinced that "OGG decoding is interesting" in terms of an API. mp3 and its aftermath are formats made for transmitting, consuming and selling (or stealing) music, not making it.

anyways, if you want to build a tracker, that's fine with me. i would have thought that's an application though, rather than part of some API.
 
that said, i'm still not convinced that "OGG decoding is interesting" in terms of an API. mp3 and its aftermath are formats made for transmitting, consuming and selling (or stealing) music, not making it.
Ok, so you are not convinced that playing music on an audio device is interesting :rolleyes:

anyways, if you want to build a tracker, that's fine with me. i would have thought that's an application though, rather than part of some API.
It's not an API specific to mod playback or trackers, but a generic interface for handling audio sample mixing. With practical use cases you can map out the generic feature set required from the API.
 
I've been thinking recently about the things people will want to actually do with Teensy3 audio.
  • Play audio: pre-recorded music, sound effects. Data is read from media and played without modification.
  • Play musical notes or synthesize sounds, using wavetables, FM, other synth algorithms? Usually at least a pitch/note# and intensity/velocity specify how the note will be played. Most of these applications probably want polyphonic support.
  • Analyze real-time audio: music visualization, beat/rhythm detection, audio user interfaces, others?
  • Modify real-time audio: guitar effects pedals, voice effects
  • Record audio

If I could abstract that to a block diagram
Code:
    WRITE          WRITE
      ^               ^
      |               |
IN >--+--> PROCESS >--+--> OUT
      |               |
      ^               ^
    READ            READ

then your cases are

  • Play audio: READ -> OUT
  • Play musical notes or synthesize sounds: READ -> PROCESS -> OUT
  • Analyze real-time audio: IN -> PROCESS -> WRITE
  • Modify real-time audio: IN -> PROCESS -> OUT
  • Record audio: IN -> WRITE

I would debate whether "most" of the synthesize-sounds cases would need polyphony. Some would, some would not. The 'play musical notes' cases (chiptunes, MIDI to General MIDI audio) would mostly be polyphonic, agreed. But those would probably not need separate polyphonic outs, just a stereo or mono mixdown.

One block flow that is missing from your list is READ -> PROCESS -> WRITE which I guess could be labelled non-realtime audio modification. Or it could be argued that at least one of IN or OUT must be present for it to qualify as audio (it depends on what the facilities are in the processing block and if people want to use that for non-realtime as well).

Oh, modifying audio could also be IN+READ -> PROCESS -> OUT: either reading instructions for the modification, or outputting audio based on a computation on two audio streams (like ring modulation). It depends on whether this is seen as a mono circuit, a stereo circuit, or a mono channel where you can add multiple channels (that would affect the processing part, as the modules would need to communicate, which makes things harder, though also more powerful since you are adding processing nodes?).
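Ring modulation makes a nice minimal example of such a two-input PROCESS block (a sketch only; the -32768 x -32768 edge case is ignored here):

Code:
#include <stdint.h>

// Multiply two 16-bit streams sample-by-sample, scaling back to 16 bits.
void ring_mod(const int16_t *in1, const int16_t *in2, int16_t *out, int n) {
    for (int i = 0; i < n; i++)
        out[i] = (int16_t)(((int32_t)in1[i] * in2[i]) >> 15);
}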
 
SoundFonts + Midifile + General MIDI are 3 things I am using.... Polyphony and reverb too... But maybe I am asking too much :)
 
What kind of features does SoundFont have that require special support from an audio device with wavetable support?
 
A SoundFont file includes many things: all the WAV files, loop points, instrument mapping and tuning, and much more...
The fact is that you can find many SF2 files on the internet.
This helps avoid having to create your own wavetable format: you can use existing soundbanks instead of making your own.
 
Yeah, I'm totally for supporting existing formats, but I'm just wondering what kind of features it has that require HW support (essentially special support from the mixing routine).
 