Faster pitch detection algorithm

I am working on a pitch correction algorithm and I have a very good pitch shift effect up and running (STFT based). The issue I am running into is that the variations in pitch happen faster than I can calculate the adjustment using the NoteFreq code. In other words, between updates of NoteFreq the frequency varies enough that the pitch correction is no longer accurate. I know the Autotune algorithm used by Antares does this in a different way than the YIN algorithm used here; if you want to see how, take a look at their patent (easy to find by searching for the Antares Autotune patent).

My question is: is there a faster pitch detection algorithm I can use, other than the Antares one (which is difficult to implement)? I know my algorithm works for the correction because I tested it with a stable frequency and it corrected perfectly. I really just need a faster detection algorithm and I am there!

Thank you!
 
Maybe try editing AUDIO_GUITARTUNER_BLOCKS in analyze_notefreq.h? Fewer blocks ought to give lower latency, but at the cost of being able to detect really low frequencies.
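For anyone unfamiliar, that constant lives in analyze_notefreq.h; the edit is roughly this (the default value and the math are from memory, so check your copy of the library):

Code:
// analyze_notefreq.h (Teensy Audio Library) -- sketch of the edit, not verbatim
#define AUDIO_GUITARTUNER_BLOCKS  24   // library default; try smaller values
// The analysis window is AUDIO_GUITARTUNER_BLOCKS * 128 samples at ~44.1 kHz,
// so 24 blocks is about 24 * 128 / 44100 ~= 70 ms of audio per estimate.
// Halving the block count halves that window (lower latency), but the YIN
// search then sees fewer periods of a low note, raising the lowest frequency
// it can reliably detect.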

Any chance you might share that good pitch shift effect code?
 
Paul,

I certainly can share it, but it is a bit of a mess. I used code found here as well as the granular shift object as a template. I am pretty new to this, so it was easier for me to take an already written module like the granular shift and modify it. Essentially it just runs that code and makes sure the audio data has been rescaled to a float in [-1, 1]. I set the frame size to 128, used an overlap of 8 (this might need tuning), and a 44100 sample rate. I can post the code, but I am not sure how people typically do this on the forum, and it may be more useful to just check out that code, because it is a better starting point than the mess I have.
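For reference, the rescaling step itself is nothing exotic. A minimal sketch of going between the 16-bit block samples and floats in [-1, 1] (illustrative names, not my actual module code):

Code:
#include <stdint.h>

// Convert one 128-sample audio block (int16) to floats in [-1, 1],
// and back again. Clamping on the way back avoids integer wrap-around.
static const int FRAME_SIZE = 128;

void blockToFloat(const int16_t *in, float *out) {
    for (int i = 0; i < FRAME_SIZE; i++) {
        out[i] = (float)in[i] / 32768.0f;      // [-32768, 32767] -> [-1, 1)
    }
}

void floatToBlock(const float *in, int16_t *out) {
    for (int i = 0; i < FRAME_SIZE; i++) {
        float s = in[i];
        if (s >  1.0f) s =  1.0f;              // clamp before narrowing
        if (s < -1.0f) s = -1.0f;
        out[i] = (int16_t)(s * 32767.0f);
    }
}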

In terms of the blocks you mentioned, that looks like the silver bullet for solving my issue! Thank you very much.

One thing I was unsure of was the sample rate. I know the exact sample rate is available, but I got weird results when I rounded it, so I just used the standard 44100 and that seemed to give good results.

Again thank you this is exactly what I needed.
 

Tried your fix and it solved my issue. Thank you! Now I just need to add some tolerance for spurious errors in the prediction and fix a few bugs. Once I am done I can certainly make the code available, but I need to spend some time cleaning it up as well.

Is there a simple way to combine multiple modules from the graphic tool into a single module? I am asking because it would be cool to have a single pitch correction module, but my implementation requires the NoteFreq module as well as my custom module. Combining them into one module would make more sense to me here.
 
I'm really interested in what you are doing as well. If I may offer a suggestion: ideally you could make your module output a 0-1 signal corresponding to pitch (i.e. a note to CV converter), which could then be "patched" to drive other filters and oscillators. Since you mentioned making a module, you could then process it as a control signal with filters and mixers without writing code.
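To make that concrete, the note-to-CV idea is basically mapping the detected frequency logarithmically onto 0-1 over a chosen tracking range. A minimal sketch (the range and names here are made up for illustration):

Code:
#include <math.h>

// Map a detected frequency onto a 0-1 "CV" value: 0.0 at F_MIN, 1.0 at F_MAX,
// linear in pitch (log frequency) in between, 1V/octave style.
static const float F_MIN = 55.0f;    // A1 -- example tracking range
static const float F_MAX = 1760.0f;  // A6, five octaves above F_MIN

float freqToCV(float freq) {
    if (freq <= 0.0f) return 0.0f;             // no pitch detected
    float cv = log2f(freq / F_MIN) / log2f(F_MAX / F_MIN);
    if (cv < 0.0f) cv = 0.0f;                  // clamp to the 0-1 range
    if (cv > 1.0f) cv = 1.0f;
    return cv;
}

That output could then be patched straight into FREQ-style inputs, or filtered and mixed like any other control signal.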
 

I am a little confused about what you mean by the note to CV conversion and how that would enable this. Could you provide more details about exactly what you have in mind?

Also let me give more details about what I am doing since you are interested.


Let me break down my approach:

The algorithm can be separated into two parts: frequency detection and frequency shifting. The detection in this case is being done by the notefreq module. In my main loop I check if it is available, and if so I take the notefreq output and determine the closest note (for now I am using the full 12-note chromatic scale, but I am working on adding specific scales). After determining the closest note, I take the ratio of the closest note frequency (let's call this CF) to the actual frequency (F): ratio = CF/F. This value is then set as a parameter in my pitch shifting code, and each block is shifted by this ratio.
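In code form, the nearest-note and ratio step is roughly this (a sketch of the idea, not my actual module code):

Code:
#include <math.h>

// Snap a detected frequency F to the nearest note of the 12-tone equal
// tempered scale (A4 = 440 Hz) and return the correction ratio CF / F.
float correctionRatio(float F) {
    if (F <= 0.0f) return 1.0f;                 // nothing detected: no shift
    float semisFromA4 = 12.0f * log2f(F / 440.0f);
    float nearest     = roundf(semisFromA4);    // closest chromatic note
    float CF          = 440.0f * powf(2.0f, nearest / 12.0f);
    return CF / F;                              // ratio handed to the shifter
}

A ratio of 1.0 means the input was already on pitch and the shifter leaves that block alone.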

The shifting is done using an STFT-based shifting algorithm sourced from here.

A few things I have noticed:

1) The algorithm requires the pitch detection to be very fast and stable. This is something Antares addressed specifically in their patent for the original Autotune (which expired in 2018). Their approach is much different from the YIN algorithm, but the original YIN paper claims, with supporting evidence, that YIN is more robust than the autocorrelation-based approach Antares approximates in their work. So I am assuming that the YIN algorithm should be suitable for this application, but I suspect the approximation method Antares uses may still be faster.

2) The pitch shifting is generally in the range of +-10 Hz at most, and usually much less (this will depend on the user's vocal range, but for me it is typically around 3-5 Hz). That requires a very accurate pitch shifting algorithm. The 128-sample block size is restrictive here because it limits the frequency resolution of the STFT (rough numbers in the sketch below). This is an issue I have yet to solve; the obvious answer is to increase the buffer size. By my estimate I will need at least 1024 samples to come close to the accuracy I need, and even more would be better. That creates memory issues as well as latency issues, so I will need to put some serious thought into this. There is a resolution-versus-latency trade-off at play here that has nothing to do with the Teensy's processing power, and I am still trying to rationalize it.
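For a sense of scale, the raw bin spacing of an STFT is just the sample rate divided by the FFT length, so before any phase-based refinement between bins the numbers look like this:

Code:
#include <cstdio>

// Raw STFT bin spacing: delta_f = sample_rate / fft_size. Phase-vocoder
// style processing can interpolate between bins, but the window length
// still bounds how accurately a few-Hz correction can be resolved.
int main() {
    const float fs = 44100.0f;
    const int sizes[] = {128, 1024, 4096};
    for (int n : sizes) {
        printf("N = %4d  ->  %6.1f Hz per bin\n", n, fs / n);
    }
    // Prints roughly 344.5, 43.1 and 10.8 Hz per bin respectively.
    return 0;
}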

Again, the original Antares algorithm uses a different approach for both the detection and the shifting. I will likely need to implement those eventually, but I am not looking forward to that, so I am holding out hope I can get things working without them.


Throughout this process I have gained a great deal of respect for the ingenuity of the Antares Autotune algorithm developed by Harold A. Hildebrand. Interestingly enough it was a result of his work in seismic signal processing.

Best wishes!
 
What is the range of the shifting? That could be very useful if, for example, you had an input (like the FREQ input of an oscillator) with a range of +-1 that, combined with a tracking range, determines how much to pitch shift whatever comes in, no matter the frequency. The pitch shifting part of your code would be incredibly useful as a stand-alone item.
 

By range I am assuming you mean how far it is able to shift. In that case it is up or down an octave, and it does this really well, actually. The part I am still working on is the tracking and properly synchronizing it with the shift. I decided the current tracking would not work for my application, so I am trying to implement that patent (ugh).

Attached is the pitch shifting code. It may not be exactly how it should be written, but it does the job. There is one bug I cannot seem to track down, and it may be a deal breaker in most cases: for some reason, as time goes on after uploading this code to the Teensy, the output becomes very distorted and does not recover until the code is re-uploaded. I am wondering if there is perhaps a memory leak somewhere. I have spent a lot of time investigating this issue with no luck, and since I have shifted my focus to implementing the patent, I still have yet to fix it.

View attachment RepitchExample.zip
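One thing I still want to rule out, purely a guess on my part, is the running phase in the STFT growing without bound. In a phase-vocoder style shifter the output phase is accumulated every frame, and if it is never wrapped, a 32-bit float slowly loses precision, which would match the "fine at first, distorted later" behaviour. A sketch of the kind of wrapping I mean (illustrative, not taken from the attached code):

Code:
#include <math.h>

// Wrap an accumulated phase back into [-pi, pi) so the float accumulator
// stays small and keeps full precision over long run times.
static float wrapPhase(float phase) {
    return phase - 2.0f * (float)M_PI
                 * floorf((phase + (float)M_PI) / (2.0f * (float)M_PI));
}

// Usage inside the synthesis loop (hypothetical variable names):
//   sumPhase[k] += deltaPhase;             // unbounded: precision decays
//   sumPhase[k]  = wrapPhase(sumPhase[k]); // bounded: stays accurate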
 
I actually got the blazing fast frequency detection from the original Autotune working; however, my pitch adjustment is still not working. Once I fix that bug, I believe I will have a functional autotune.
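For anyone curious, the detection side of that patent is autocorrelation-based (the approach I mentioned earlier that YIN is compared against). A deliberately naive illustration of the underlying idea, nothing like the patent's fast formulation and not my actual code:

Code:
#include <stddef.h>

// Brute-force autocorrelation pitch estimate, for illustration only.
// The patent's contribution is computing this kind of lag estimate far
// more cheaply; this O(N * lags) version just shows the basic idea.
float detectPitch(const float *x, size_t n, float fs,
                  float fMin, float fMax) {
    size_t lagMin = (size_t)(fs / fMax);       // shortest period of interest
    size_t lagMax = (size_t)(fs / fMin);       // longest period of interest
    if (lagMin < 1) lagMin = 1;
    if (lagMax >= n) lagMax = n - 1;

    float  bestR   = 0.0f;
    size_t bestLag = 0;
    for (size_t lag = lagMin; lag <= lagMax; lag++) {
        float r = 0.0f;
        for (size_t i = 0; i + lag < n; i++) {
            r += x[i] * x[i + lag];            // correlation at this lag
        }
        if (r > bestR) { bestR = r; bestLag = lag; }
    }
    return bestLag ? fs / (float)bestLag : 0.0f;  // best lag -> frequency
}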
 