Teensy 3.0/3.1 Audio Library with sound pitch control


Hi.

The new audio library offers a lot of functionality, but there is one thing I'm missing for my current project. I went through the features offered by the library and failed to find any mention of sound pitch control. Would this be something that could be easily added to the library, or have I just missed the feature / failed to notice a function that offers a similar end result?

I would love to implement a sound module with a PWM input to control the pitch of a sound sample. The sample could be an engine sound, general machinery, etc. I'm pretty confident that I can manage all the other aspects of this build, but this crucial part, the pitch control, is missing. Any help?

-Jani
 
Currently this only exists in the waveform synth object, where you can load an arbitrary waveform. It's your responsibility to make sure the data you load is bandwidth limited, for however much you intend to speed it up, so you don't get horrible aliasing.

If you can create 256 sample waveforms, with appropriate bandwidth limiting, this should work great for basic waveform pitch shifting.

Implementing a more complete SoundFont player is on the development roadmap.

http://www.pjrc.com/teensy/td_libs_AudioRoadmap.html
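For illustration, a minimal (untested) sketch of that approach, assuming the audio shield's SGTL5000 output; the table contents and the A0 pitch input are placeholders, not anything from the library examples:

Code:
#include <Audio.h>

AudioSynthWaveform    wave;
AudioOutputI2S        i2s1;
AudioControlSGTL5000  codec;
AudioConnection       patchL(wave, 0, i2s1, 0);
AudioConnection       patchR(wave, 0, i2s1, 1);

int16_t engineCycle[256];   // one cycle of the sound; a real table should be prepared offline

void setup() {
	AudioMemory(10);
	codec.enable();
	codec.volume(0.5);

	// Placeholder data only (a plain ramp, NOT band limited) so the sketch compiles;
	// a real table must be filtered for the highest pitch you intend to play.
	for (int i = 0; i < 256; i++) engineCycle[i] = (i - 128) * 256;

	wave.arbitraryWaveform(engineCycle, 172.0);   // 256-sample table + max frequency hint
	wave.begin(0.8, 110.0, WAVEFORM_ARBITRARY);   // amplitude, starting pitch, shape
}

void loop() {
	// Map an analog input to pitch (pin A0 is just an assumption for this sketch).
	float pitch = 55.0 + analogRead(A0) * 0.5;    // roughly 55 to 566 Hz
	wave.frequency(pitch);
	delay(10);
}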
 
I would love to implement a sound module with a PWM input to control the pitch of a sound sample. The sample could be an engine sound, general machinery, etc. I'm pretty confident that I can manage all the other aspects of this build, but this crucial part, the pitch control, is missing.

i'm assuming you have .wav files in mind? at any rate, as was mentioned, that's not possible atm.

if by "control[l]ing the pitch" you were thinking "varispeed", that'll probably happen sooner or later. if you actually meant pitch "shifting" -- unlikely; that's an entirely different ballpark. one standard way of doing it involves an FFT and iFFT, and that's probably not doable, or at least it doesn't sound trivial (from said roadmap: "The 1024 point FFT consumes 52% CPU time on 1 of every 4 audio updates. On the other 3 of 4 updates, it merely stores the incoming audio, to be analyzed on the 4th update." )

i'm guessing SoundFont is more suitable in this context than trying to do it in real time (though using them seems tedious, and it'll be limited to midi pitches) - which editor are you having in mind though, paul? swami?

the roadmap also mentions wavetable synthesis -- is this going to be a separate development? ie support of complex/bandlimited wavetables (rather than single cycle?) if so, which format were you aiming at?

i'm asking because of, well, editors - there doesn't seem to be one that's both decent and cross platform. what seems to be the most active / fully featured one unfortunately is windows only (though playing ok with wine); it supports a great many formats: classic .WT (terratec komplexer), .WAV (composite/with loop points), transwave (ensoniq), .SYX (blofeld), .WTX (PPG) etc -- http://www.youtube.com/watch?v=Pifmp8wqeEQ
 
The sample would be in wav format (it can be something else, but let's go with this for now). I believe the technique is called pitch shifting (like Paul noted) and there are some good (and a bit too complex for me) articles on the net with the required math. Many sites seem to refer to this one: http://www.guitarpitchshifter.com/pitchshifting.html . FFT is the beast that one needs to tackle for this, it seems.

That idea with a list of x-number of samples from the same source but pre-shifted with a computer seems doable... Something I have to try... Only thing is that the clips need to blend nicely when changing from one to another and there should be no "clips, chirps" etc. when looping the selected sample... Seems like a lot of work to prepare everything, thus making the math solution much more desirable. That is, if it is doable at all. :)
 
Only thing is that the clips need to blend nicely when changing from one to another and there should be no "clips, chirps" etc. when looping the selected sample...

The fader object is meant for smoothly transitioning between 2 sounds. Run each sound through a fader, and add their 2 outputs with a mixer. Call the functions to make one fade out and the other fade in, both over the same length of time. The algorithm keeps the two magnitudes summing to 1.0 when the fades run at the same time and the same speed.
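A rough sketch of that wiring, assuming two raw-file players on the SD card; the file names and pin numbers are placeholders and would need adjusting for real hardware:

Code:
#include <Audio.h>
#include <SPI.h>
#include <SD.h>

AudioPlaySdRaw       sampleA;
AudioPlaySdRaw       sampleB;
AudioEffectFade      fadeA;
AudioEffectFade      fadeB;
AudioMixer4          mix;
AudioOutputI2S       i2s1;
AudioControlSGTL5000 codec;

AudioConnection c1(sampleA, 0, fadeA, 0);
AudioConnection c2(sampleB, 0, fadeB, 0);
AudioConnection c3(fadeA, 0, mix, 0);
AudioConnection c4(fadeB, 0, mix, 1);
AudioConnection c5(mix, 0, i2s1, 0);
AudioConnection c6(mix, 0, i2s1, 1);

void crossfadeToB(unsigned int ms) {
	// Both fades run over the same length of time, so the summed level stays near 1.0.
	fadeA.fadeOut(ms);
	fadeB.fadeIn(ms);
}

void setup() {
	AudioMemory(20);
	codec.enable();
	codec.volume(0.5);
	SPI.setMOSI(7);              // audio shield SD wiring; adjust for your hardware
	SPI.setSCK(14);
	SD.begin(10);                // CS pin 10 is an assumption (audio shield)
	mix.gain(0, 1.0);
	mix.gain(1, 1.0);
	sampleA.play("CLIP1.RAW");   // placeholder file names
	sampleB.play("CLIP2.RAW");
	fadeA.fadeIn(1);             // start with A audible, B silent
	fadeB.fadeOut(1);
	crossfadeToB(500);           // then blend over to B in half a second
}

void loop() {
}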
 
i'm porting a pitchshifter to teensy audio library at the moment ;)

It'll be quick and dirty but it should work ok.
 
i'm porting a pitchshifter to teensy audio library at the moment ;)

Is it based on that algorithm with 4 overlapping 1024 point FFT and inverse FFTs?

The ARM math lib FFT takes about 52% CPU time in an audio update, so in theory every 1024 samples, 4 forward FFTs and 4 inverse FFTs might be staggered to get that algorithm to run on a Teensy 3.1. Maybe?
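A back-of-envelope version of that budget, assuming the library's stock 128-sample update (the ~52% figure is from the roadmap page quoted above):

Code:
constexpr int blockSamples = 128;                       // samples per audio update
constexpr int fftSize      = 1024;                      // transform length
constexpr int updatesPer1k = fftSize / blockSamples;    // 8 audio updates per 1024 samples
constexpr int transforms   = 4 + 4;                     // 4 forward + 4 inverse FFTs
// transforms == updatesPer1k, so one ~52% transform per update could in principle be
// staggered, leaving a bit under half the CPU for windowing, overlap-add and the rest.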
 
Further to my previous response, my initial goal is to solve the problem - the lack of a pitchshifter - by any means necessary. In order to do this I will not be looking at ideal algorithms. This problem has been solved hundreds of times by tracker software, which had to make do with scant CPU power.

On my old 16-bit 286 at 16 MHz I used to be able to play 16-channel S3M files, all of which applied software pitch shifting with a fraction of the CPU power available to the Teensy, admittedly on lower bit-rate samples, but nevertheless it worked fine. So if my porting attempt is not successful, then I will look to the solutions of yesteryear :).

This was not with a Gravis Ultrasound or other hardware mixer/DSP sound card; we are talking Sound Blaster 8-bit and 16-bit editions, pre-AWE models.

But why reinvent the wheel when a guy has got it working on a lesser ARM chip already?

Found a great link though:
http://www.guitarpitchshifter.com/

Reading through it provides a great basic overview of what we are trying to achieve.

I predict we will need a Q setting for the pitchshifter to balance quality against speed, as there are many ways to skin this particular cat.

So this will be the start of many revisions no doubt.
 
Further to my previous response, my initial goal is to solve the problem - the lack of a pitchshifter - by any means necessary. In order to do this I will not be looking at ideal algorithms. This problem has been solved hundreds of times by tracker software, which had to make do with scant CPU power.

On my old 16-bit 286 at 16 MHz I used to be able to play 16-channel S3M files, all of which applied software pitch shifting with a fraction of the CPU power available to the Teensy, admittedly on lower bit-rate samples, but nevertheless it worked fine. So if my porting attempt is not successful, then I will look to the solutions of yesteryear :).
It is amazing to consider how well trackers did it in the 80s with such tiny amounts of computing power. But I believe that was largely due to the special offboard sound chips that machines like the Amiga had - very little was done on the CPU.

For a basic pitch shifting implementation, check out the Nootropic Audio Hacker project. He's using a low-end ATmega Arduino with an offboard ADC and DAC, so he's got very little power, so he's doing granular synthesis of a sort, totally ignoring any fancy FFTs that would do it cleanly. The audio samples he shows sound surprisingly decent, all things considered, but that might be the 8-bit audio masking the glitches of skipping the proper FFT maths. https://nootropicdesign.com/audiohacker/projects.html

Is there a Teensy lib to access the Cortex-M4's onboard DSP hardware?
 
The PC didn't have any hardware to assist until the Gravis Ultrasound came around. :)

Even then you're looking at hardware mixing only.

Impulse Tracker could even play through your PC speaker. Sounded awful.

I'll look at his code and have a nose :)
 
I've just remembered. Those trackers didn't pitch shift at all; they resampled to the note. I recall the samples played shorter at the higher notes. Please forgive me, it was 20-plus years ago.

That requires a surprisingly complicated solution as the audio library runs at one rate only. I'll have to create a stream using a sample as the source.
 
Is there a Teensy lib to access the Cortex-M4's onboard DSP hardware?

Well, that's pretty much what the Teensy Audio Library does.

The DSP instructions are accessed using the inline functions defined in this file:

https://github.com/PaulStoffregen/Audio/blob/master/utility/dspinst.h

For an example of how this works, look at the Biquad filter.

https://github.com/PaulStoffregen/Audio/blob/master/filter_biquad.cpp#L53

The honest truth of the Cortex-M4 DSP instructions is they take quite a bit of planning and work to use properly. ARM doesn't say that in their marketing materials, but unless you want a specific math operation that's already optimized in a library, you have a lot of work to do.

You have to surround them with code that brings pairs of 16-bit samples into 32-bit words. This effectively doubles the register storage, and it lets you take advantage of the M4's optimized load and store of successive 32-bit words, and reduced looping overhead. Some of the operations are very convenient too, like multiplying 16 * 32 bits for a 48-bit result, and then discarding the low 16 bits, which consumes only a single 32-bit register for the output.

Optimizing code with these instructions is all about planning how many registers will hold inputs, outputs and intermediate data. It's very low-level work requiring quite a lot of thinking about registers and clock cycles. You can get quite a bit of speedup this way, but it's not simple or easy. Many parts of the library are written to leverage these instructions. The FIR filter and FFT from ARM's math lib use them, and I used them in many places in the library, like the biquad filter, noise generators, state variable filter, mixer, etc.

There's no simple and easy way to automatically use the DSP instructions. It always involves a lot of careful planning and optimization work.
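To make the "pairs of 16-bit samples in 32-bit words" idea concrete, here is a tiny illustrative dot product written with the CMSIS __SMLAD intrinsic; dspinst.h wraps the same instruction with its own inline-asm helpers, so this is a sketch of the technique rather than code from the library:

Code:
#include <arm_math.h>   // pulls in the Cortex-M4 SIMD intrinsics on Teensy 3.x

// 64-sample dot product, two 16-bit samples per loop iteration.
// Assumes the buffers are 32-bit aligned (audio blocks in the library are).
int32_t dot64(const int16_t *a, const int16_t *b)
{
	const uint32_t *pa = (const uint32_t *)a;   // each 32-bit load grabs a pair of samples
	const uint32_t *pb = (const uint32_t *)b;
	int32_t acc = 0;
	for (int i = 0; i < 32; i++) {
		uint32_t sa = *pa++;
		uint32_t sb = *pb++;
		// acc += (low 16 x low 16) + (high 16 x high 16), one SMLAD instruction
		acc = (int32_t)__SMLAD(sa, sb, (uint32_t)acc);
	}
	return acc;
}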
 
The truth is, for most projects the Teensy/Audio library will be used for, there is more than enough CPU time without further optimisation. I haven't even considered overclocking the Teensy yet, but it will do almost double the speed it runs at by default.

The mixers are extremely well implemented; during my testing I was most impressed at how many samples can be mixed in real time from a sketch, and how little it impacts CPU consumption. This leaves plenty of headroom for the other non-optimised functions.

Most projects will work just fine: guitar pedals, mini synths (yet to be tested, but I would guess even 8-note polyphony is not out of reach), drum machines, DJ machines, loopers, voice changers, FX boxes - all of this is highly unlikely to reach the limits of the CPU. It's only when you try and push the hardware further than it is intended to go (and push you should!) that you'll start hitting hard limits from the kit.

SD card access is always going to be the biggest issue, but again, not a problem for simple projects. This is the very next thing on my list: I'm doing a breakdown of how many WAVs can be streamed from the SD card in unison, in each of the supported formats. As far as I can tell from the code, I can use 16-bit WAVs at 44100 or 22050 Hz, in both mono and stereo flavours. I've made the files up and they're on the SD card; I just need to try opening them now. An evening's fiddling with an Excel file might reveal that up to 4 or 5 16-bit 22 kHz mono files can be played simultaneously, mixed, with effects and envelopes. Considering Roland used to release professional audio products with 4-voice polyphony that used 6- and 9-bit bit depths, this is enough to make some really amazing projects which sound great.

I think this is a testament to Paul's optimisations within the core functions - it's also nice to know that there is room for improvement in a lot of other areas. My approach at the moment is to provide the functionality; optimisation can be done at a later stage, if it is ever found to be necessary.
 
I am wondering if a somewhat small modification to the SD playRaw .cpp could allow for pitched playback - or at least reverse / half-speed / double-speed playback. It appears to me that the update function in the library has a variable "n" which is set to the number of bytes returned by the SD read and is accumulated into file_offset. Could this be a matter of just adding an if (mode == reverse) that switches to decrementing (-=) for reverse, or of dividing or multiplying by 2 the "n" value that drives the offset movement? For that matter, could "n" just be multiplied by a signed float to drive its speed/direction of advance through the SD file?

Code:
void AudioPlaySdRaw::update(void)
{
	unsigned int i, n;
	audio_block_t *block;

	// only update if we're playing
	if (!playing) return;

	// allocate the audio blocks to transmit
	block = allocate();
	if (block == NULL) return;

	if (rawfile.available()) {
		// we can read more data from the file...

		n = rawfile.read(block->data, AUDIO_BLOCK_SAMPLES*2);
		// <-- COULD THIS BE /2 for double speed and *2 for half speed?
		//     or possibly even a float multiplier for varispeed?

		file_offset += n;
		// <-- COULD THIS BE a -= version for reverse?
		//     or if "n" was multiplied by a negative float, would it reverse?


		for (i=n/2; i < AUDIO_BLOCK_SAMPLES; i++) {
			block->data[i] = 0;
		}
		transmit(block);
	} else {
		rawfile.close();
		AudioStopUsingSPI();
		playing = false;
	}
	release(block);
}
 
i don't think it's that easy. read() does return the number of bytes read (n), but file_offset merely seems to track the position in the file, and is used by positionMillis(void). there's no explicit increment of the read position (if i see things right), just the call to rawfile.read(..).


for varispeed one way to go about it would be to use an intermediate buffer to fill up "block->data" rather than doing it straight from the SD card. you probably could use a phase accumulator à la the synth_sine.cpp etc classes to move through that buffer -- something like the sketch below.
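a rough sketch of that idea (the names buf, phase and step are made up for illustration; 16.16 fixed point is assumed):

Code:
#include <Audio.h>   // for AUDIO_BLOCK_SAMPLES

static int16_t  buf[1024];          // intermediate buffer, refilled from the SD card elsewhere
static uint32_t phase = 0;          // 16.16 fixed-point read position within buf
static uint32_t step  = 0x8000;     // 0x8000 = half speed, 0x10000 = normal, 0x20000 = double

// Fill one audio block from buf at a fractional rate, with linear interpolation.
void fillBlockVarispeed(int16_t *out)
{
	for (int i = 0; i < AUDIO_BLOCK_SAMPLES; i++) {
		uint32_t idx  = phase >> 16;          // integer sample index
		int32_t  frac = phase & 0xFFFF;       // fractional part
		int32_t  s0 = buf[idx];
		int32_t  s1 = buf[idx + 1];
		out[i] = s0 + (((s1 - s0) * frac) >> 16);
		phase += step;
		// a real object would refill buf from SD, guard the end of the buffer,
		// and rebase phase when it wraps
	}
}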


(so is it macromachines as in 'storage strip'?)
 
Hahaha indeed! So I assume you are a Eurorack user? How, might I ask, do you know of our first product? We are about to announce a full line of upcoming modules at NAMM a month from now, and I am exploring the Teensy as a fantastic platform for developing some of them. Our big thing of course is having a digital back-end protocol driving all of the modules at once, so you can save your patch cable routings with our dynamic destiny switch, mix or attenuvert them with our flux fusion mixer, sequence them with our.... Well, you get the picture...

I decided it might be a good thing to actually have some sound generating and processing modules to show at NAMM, as our first focus was mainly CV and routing so that people could add storage to their existing systems.

Back to the topic at hand: you may be correct. I think my comments in the code don't translate well in the browser, so it's hard to see; I am referring to this line:
file_offset += n;

Maybe that is just keeping track of the position and not setting it? I am still not entirely savvy to the way some of the methods in this library work.
 
as far as i can see, file_offset just gives you access to the position from within the library; it doesn't set the position, most of that is hidden from view (take a look at SdFile.cpp). (for similar reasons, ie the way SD.h etc works, i think "reverse play" would be tricky without adding stuff to the SD library, and even then it might not be doable very efficiently; but i might be wrong).

and yup, i am. i think i must have seen it on muffwiggler. cool .. will be curious about the expanded module range. so i figure some sampler thing is one of them...
 
If it is possible, it will come at a great performance cost.

SD cards excel at sequential access, but don't do so well when you are hopping around. You'd at least need to work in full blocks, and then reverse them.

A better solution (at least for different playback speeds) would be to stream into the 16Mb SPI RAM and process the data from there. The same streaming solution could easily support reading blocks in reverse efficiently.

Edit: Having recently actually written data to the flash chip, this would not work; it's way too slow to write to. But it reads really fast!
 
I'm not sure how much help this will be, but the talk about S3M and MOD trackers reminded me of the following Arduino and Teensy projects - maybe their code will help with some pointers on how to get pitch control working?

Teensy 3.1 MOD player: https://code.google.com/p/arduino-music-player/ (video: https://www.youtube.com/watch?v=zFMUmE5CdrQ )
This player uses a PC-side tool to convert the S3M files into a compressed format, however I think that the samples remain untouched and only the tracker information is manipulated.

ATmega644 MOD player - http://www.elektronika.kvalitne.cz/ATMEL/S3Mplayer/S3Mplayer_eng.html (video: https://www.youtube.com/watch?v=j6ijbexoq-M )
One of several ATmega AVR players.

Thank you for working on this Pensive, I've been looking for a pitch-control capable library for Arduino/Teensy for over a year now. :)
 
I'm not sure how much help this will be, but the talk about S3M and MOD trackers reminded me of the following Arduino and Teensy projects - maybe their code will help with some pointers on how to get pitch control working?

Teensy 3.1 MOD player: https://code.google.com/p/arduino-music-player/ (video: https://www.youtube.com/watch?v=zFMUmE5CdrQ )
This player uses a PC-side tool to convert the S3M files into a compressed format, however I think that the samples remain untouched and only the tracker information is manipulated.

ATmega644 MOD player - http://www.elektronika.kvalitne.cz/ATMEL/S3Mplayer/S3Mplayer_eng.html (video: https://www.youtube.com/watch?v=j6ijbexoq-M )
One of several ATmega AVR players.

Thank you for working on this Pensive, I've been looking for a pitch-control capable library for Arduino/Teensy for over a year now. :)

I've currently shelved the pitchshifter for the next month or two at least, sorry. Getting carried away with my project and with Teensy-LC beta testing, and creating a studio in the garage, and work, and young kids =D



It's actually possible to use the waveform object to pitchshift already, but i've not tried setting it up yet. Look at the "arbitrary waveform" function of a waveform, load your sample into the array there (maximum 256 samples), set your playback frequency, then every ~5.805 milliseconds load the next 256 samples of the audio. OR something like that. ( 256 / 44100 Hz ≈ 5.805 ms )

Paul mentioned this to me ages ago in the Audio thread but it seemed a little mysterious at the time. It's not of course, it's a very simple way of achieving the goal. I just haven't got around to it.

Better would be to hack the code so you just modify the array pointer each time instead of moving data. You might be able to point it at the internal audio buffers of the Audio engine; this would work for WAV, RAW, and indeed potentially any sound source, I'm guessing, with minimal CPU overhead. But I'm really just brainstorming here. I might be full of it! It wouldn't be the first time.
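A rough, untested sketch of the chunked-playback idea described above; the clip array, chunk timing and pitch ratio handling are assumptions rather than working code:

Code:
#include <Audio.h>

AudioSynthWaveform   wave;
AudioOutputI2S       i2s1;
AudioControlSGTL5000 codec;
AudioConnection      p1(wave, 0, i2s1, 0);
AudioConnection      p2(wave, 0, i2s1, 1);

const uint32_t clipLength = 4096;   // placeholder length; a real recording goes here
int16_t clip[clipLength];           // filled elsewhere (copied from flash, SD, etc.)

uint32_t pos = 0;
float pitchRatio = 1.0;             // 2.0 = up an octave, 0.5 = down an octave
elapsedMicros chunkTimer;

void setup() {
	AudioMemory(12);
	codec.enable();
	codec.volume(0.5);
	wave.arbitraryWaveform(clip, 172.0);
	// 44100 / 256 = ~172.27 Hz plays one 256-sample chunk per ~5.805 ms at normal pitch
	wave.begin(0.8, 172.27 * pitchRatio, WAVEFORM_ARBITRARY);
}

void loop() {
	// advance to the next 256-sample chunk once the current one has had time to play
	if (chunkTimer >= (unsigned long)(5805.0 / pitchRatio)) {
		chunkTimer = 0;
		pos += 256;
		if (pos + 256 > clipLength) pos = 0;   // loop the clip
		wave.arbitraryWaveform(clip + pos, 172.0);
	}
}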
 
I've currently shelved the pitchshifter for the next month or two at least, sorry. Getting carried away with my project and with Teensy-LC beta testing, and creating a studio in the garage, and work, and young kids =D



It's actually possible to use the waveform object to pitchshift already, but i've not tried setting it up yet. Look at the "arbitrary waveform" function of a waveform, load your sample into the array there (maximum 256 samples), set your playback frequency, then every ~5.805 milliseconds load the next 256 samples of the audio. OR something like that. ( 256 / 44100 Hz ≈ 5.805 ms )

Paul mentioned this to me ages ago in the Audio thread but it seemed a little mysterious at the time. It's not of course, it's a very simple way of achieving the goal. I just haven't got around to it.

Better would be to hack the code so you just modify the array pointer each time instead of moving data. You might be able to point it at the internal audio buffers of the Audio engine; this would work for WAV, RAW, and indeed potentially any sound source, I'm guessing, with minimal CPU overhead. But I'm really just brainstorming here. I might be full of it! It wouldn't be the first time.

Any luck getting the pitch shifting to work in the past few months? I just got a project that will need me to pitch shift a live input from the mic for playback to the audio out. I could also record a wav and play it back pitch shifted.

Do you have any work done so far that I could take a look at?

I am gonna try the arbitrary waveform function to see if I can get that to do pitch shifting as you suggested.
Thanks!
T
 
I am gonna try the arbitrary waveform function to see if I can get that to do pitch shifting as you suggested.

Short of digging into the library and crafting your own audio processing code, that's the closest thing.

If you are going to play the waveform faster than 1 of its samples for each audio library sample (a net speedup), you need to pre-filter your waveform before loading it into the arb waveform buffer. Otherwise, you can get terrible aliasing.

For example, consider if your arbitrary waveform has bandwidth from 20 Hz to 22 kHz when you give it to the waveform object. Then you play it 40% faster. If this were analog, like playing a tape faster, your signal's bandwidth would become 28 Hz to 30.8 kHz.

But this isn't analog. It's digital. Digital sampling has a Nyquist limit of half the sample frequency. When you try to increase the speed, the portion of your original signal from 20 Hz to 15.7 kHz will become 28 to 22 kHz. However, the part from 15.7 kHz to 22 kHz will alias and become terrible distortion, from 13.2 kHz to 22 kHz. That's how digital works!

To do this properly, you must create or filter your waveform so it doesn't have any content above 15.7 kHz (for this example of a 40% speedup). If you're going to change the speed over a wide range, perhaps you'll make several filtered copies, so you can use the ones that still have more of the original frequency content for modest speed increases, and less for the times you need to massively speed it up. This may seem like you're losing something, but you're not (assuming good filters). The stuff you're removing with the filters is the content that would have become inaudible ultrasonic frequencies.... if it were sped up by analog means. In digital sampling, you *must* remove that stuff before you try to speed up the waveform.

Slowing down is easier. In extreme cases you might generate a lot of extreme bass or sub-audible tones, but there isn't a Nyquist limit problem.
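The arithmetic behind the 15.7 kHz figure, as a tiny helper (the function name is made up, not part of the library):

Code:
// For a given speed-up factor, everything above sampleRate/2/factor has to be
// filtered out of the waveform before it is loaded.
float preFilterCutoff(float sampleRate, float speedFactor)
{
	return (sampleRate * 0.5f) / speedFactor;   // Nyquist divided by the speed-up
}
// preFilterCutoff(44100.0, 1.4) == 15750 Hz, matching the 15.7 kHz figure above.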
 
Short of digging into the library and crafting your own audio processing code, that's the closest thing.

If you are going to play the waveform faster than 1 of its samples for each audio library sample (a net speedup), you need to pre-filter your waveform before loading it into the arb waveform buffer. Otherwise, you can get terrible aliasing.

For example, consider if your arbitrary waveform has bandwidth from 20 Hz to 22 kHz when you give it to the waveform object. Then you play it 40% faster. If this were analog, like playing a tape faster, your signal's bandwidth would become 28 Hz to 30.8 kHz.

But this isn't analog. It's digital. Digital sampling has a Nyquist limit of half the sample frequency. When you try to increase the speed, the portion of your original signal from 20 Hz to 15.7 kHz will become 28 to 22 kHz. However, the part from 15.7 kHz to 22 kHz will alias and become terrible distortion, from 13.2 kHz to 22 kHz. That's how digital works!

To do this properly, you must create or filter your waveform so it doesn't have any content above 15.7 kHz (for this example of a 40% speedup). If you're going to change the speed over a wide range, perhaps you'll make several filtered copies, so you can use the ones that still have more of the original frequency content for modest speed increases, and less for the times you need to massively speed it up. This may seem like you're losing something, but you're not (assuming good filters). The stuff you're removing with the filters is the content that would have become inaudible ultrasonic frequencies.... if it were sped up by analog means. In digital sampling, you *must* remove that stuff before you try to speed up the waveform.

Slowing down is easier. In extreme cases you might generate a lot of extreme bass or sub-audible tones, but there isn't a Nyquist limit problem.

Thanks for the note, Paul. I am thankfully not speeding up the sound. I need to take a 24 kHz - 26 kHz sound and pitch shift it down to an audible range, maybe 12-13 kHz, or 6-7.5 kHz if there aren't too many artifacts in the time stretch.

I am trying to test using 27 kHz frequencies, but I am either having a hard time generating the frequency or having a hard time sensing it with the Teensy. I have to test a sound generation device that I don't have in my hands just yet. I will be ordering a few of these ultrasonic transducers to see how they fare with the Teensy mic in.
http://www.mouser.in/ProductDetail/Kobitone/255-250SR16P-ROX/?qs=sGAEpiMZZMvtrnhC60i%2bOsZHfFLa1IkJ
 