Teensy 4.0 As a Live Vocal Processor?

I'm not exactly sure at the moment how this would be done. Are you looking to slow down the tempo of a piece of music while keeping the same pitch, like the Amazing Slow Downer? In the future, it's probably best to start another thread for this.

Duff,

I appreciate the quick response. I am really interested in real-time stretching of voice as opposed to music (though the two are very similar). For example, changing the playback rate of an audiobook in an app changes only the playback speed, not the pitch, but I am interested in achieving this in real time. It's probably like the Amazing Slow Downer (I'm not sure whether that program uses time-domain or frequency-domain processing), but again I want it to work in real time, and I am not interested in huge stretch factors. With extreme stretching over a long time there is the additional issue of data building up in Fourier space, so only so much real-time stretching is reasonable. Speeding up the timing in real time causes its own set of issues. In the past I did this with an RPi running Csound and was able to stretch minutes of audio by tens of percent. In this case I really just want to stretch a second of audio (maybe a few words) at a time, so I thought the Teensy 4.0 might suffice.

This is usually achieved by taking the FFT and then, before the IFFT is performed, shifting the framing in time. There are many papers that present similar ways to achieve this, such as the following one.

https://pdfs.semanticscholar.org/0e4c/1fae5056859a18211510a6d579989d29951e.pdf
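In case it helps to see the frame-shifting idea in code, here is a minimal sketch of the phase-propagation step that lets the synthesis hop differ from the analysis hop, which is the heart of FFT-based time stretching. It assumes the FFT of each windowed frame has already been computed elsewhere (magnitudes and phases per bin), and the names and parameters are only illustrative, not taken from any particular library.

Code:
// Phase-propagation step of a phase vocoder time stretch (sketch only).
// The FFT/IFFT and windowing are assumed to happen elsewhere.
#include <cmath>
#include <vector>

struct StretchState {
    std::vector<float> lastPhase;  // analysis phase of the previous frame, per bin
    std::vector<float> sumPhase;   // accumulated synthesis phase, per bin
    explicit StretchState(int bins) : lastPhase(bins, 0.0f), sumPhase(bins, 0.0f) {}
};

// Turn one frame's analysis phases into synthesis phases when the
// synthesis hop differs from the analysis hop (stretch = synthesisHop / analysisHop).
void propagatePhases(const std::vector<float>& phase,  // analysis phases, size = fftSize/2 + 1
                     std::vector<float>& outPhase,     // synthesis phases, same size
                     StretchState& st,
                     int fftSize, int analysisHop, int synthesisHop)
{
    const float twoPi = 6.2831853f;
    const int bins = fftSize / 2 + 1;
    for (int k = 0; k < bins; ++k) {
        // Phase advance bin k would have over one analysis hop if it sat exactly on the bin frequency.
        float expected = twoPi * k * analysisHop / float(fftSize);
        // Measured deviation from that, wrapped to [-pi, pi].
        float delta = phase[k] - st.lastPhase[k] - expected;
        delta -= twoPi * std::round(delta / twoPi);
        // Instantaneous frequency of the bin, in radians per sample.
        float instFreq = (twoPi * k / float(fftSize)) + delta / analysisHop;
        // Advance the synthesis phase by the *synthesis* hop at that frequency.
        st.sumPhase[k] += instFreq * synthesisHop;
        outPhase[k]     = st.sumPhase[k];
        st.lastPhase[k] = phase[k];
    }
}

The resynthesized frames are then overlap-added at the synthesis hop, which is what actually changes the duration without changing the pitch.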

In terms of the appropriateness of this subject in this thread, I must say I am scratching my head as to why you feel this way. I am interested in real-time vocal processing via a phase vocoder, which enables frequency-domain processing of both playback speed and pitch independently, so I feel it fits perfectly. If you disagree, though, I can start a new thread. Anyhow, I appreciate all the work you did; I think you did a lot of the heavy lifting, so hopefully I can figure out the time part of the phase vocoder in the next few weeks between kids and work. Thanks again.
 
It's not that it's not appropriate, just that you'd probably be better served starting your own thread about this specific topic.

As far as time stretching goes, it looks like you more or less know what you're looking for. Maybe this code can help you; it's the original pitch shift code I based mine on, and it's probably easier to understand. http://blogs.zynaptiq.com/bernsee/repo/smbPitchShift.cpp

Currently I'm focusing on adding formant preservation and single-input/multiple-output support to my current code base.
 
The basis of time stretching is easy, but implementing it in a way that sounds "good" is not that easy.

Time stretching is based on the repetition of little pieces (grains). This way you "stretch" the audio without changing its pitch. Repeating every grain once gives you double the length, so you need to adjust the length of the repeated piece depending on how much you want to stretch.

The problem is that just doing that sounds too "clicky". To avoid it you have to overlap and crossfade every transition, and this is the hard part.
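To make that concrete, here is a minimal sketch of the grain-repeat-and-crossfade loop, working on plain float buffers rather than Teensy audio blocks. The function name, grain size, and overlap are illustrative assumptions, not code from this thread.

Code:
// Time-domain time stretch by repeating grains and crossfading the seams (sketch).
#include <vector>

// Stretch `in` by `ratio` (> 1.0 means slower / longer output).
std::vector<float> stretchGrains(const std::vector<float>& in,
                                 float ratio,
                                 int grainSize = 1024,
                                 int overlap   = 256)
{
    std::vector<float> out;
    const int synthesisHop = grainSize - overlap;        // how far the output advances per grain
    const int analysisHop  = int(synthesisHop / ratio);  // how far the input advances per grain
    for (int pos = 0; pos + grainSize < (int)in.size(); pos += analysisHop) {
        int outStart = (int)out.size() - overlap;        // start inside the previous grain's tail
        if (outStart < 0) outStart = 0;
        out.resize(outStart + grainSize, 0.0f);
        for (int i = 0; i < grainSize; ++i) {
            // Linear crossfade over the overlap region to hide the grain boundary.
            float fade = (i < overlap && outStart > 0) ? float(i) / overlap : 1.0f;
            out[outStart + i] = out[outStart + i] * (1.0f - fade) + in[pos + i] * fade;
        }
    }
    return out;
}

Because the output advances by the synthesis hop while the input advances by the (smaller) analysis hop, the audio comes out longer but at the same pitch; the crossfade is what keeps the grain boundaries from clicking.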
 

This conversation is interesting... Is this a similar process to "Anti-aliasing"? If I'm not mistaken, antialiasing for a DAC is the process of "smoothing out" the rhythmic pulse of a digital audio signal. It sounds like time stretching must use a similar method.
 
Yes, both time-based and frequency-domain-based techniques are not too difficult to code. Time-based approaches can often sound clicky, and frequency-domain-based techniques tend to sound smeary (almost like too much anti-aliasing).

Duff, thanks for the original code. You are probably right about starting a new thread, but I've got enough to chew on to keep me busy for a while. I will let you know when I get this done.
 