Point Triangulation from Sound with High Accuracy


Tootsie:
I am in the middle of a project where I am trying to triangulate the source of a noise, using a Teensy 4. I do not care about audio quality; I just need a quick response to a noise over a certain threshold. I will use the Teensy 4 to log the time difference between each microphone picking up the noise. I likely will not need interrupts, as the maximum time between mics responding is about 1 ms, so sampling should be fine. The first microphone to detect a noise will start a timer, and when all the microphones "hear" the noise, or 1-2 ms pass, the times will get passed to my Raspberry Pi to process the information.

The overall mic-to-mic distance is relatively small, ranging from about 280 to 400 mm. Because of this, I need very high timing accuracy.

I am planning on using 4 LT1721 comparators (10 ns delay) along with 4 electret microphones for the triangulation.

I have read up a bit on electret mics and op amps, and I have found that a high slew rate might be what I am looking for to reduce delay. Are there any other parameters I should be interested in? I am capable of creating my own circuit, but for now would prefer to buy a mic/op-amp combo.

Along with this, what would be the recommended way to store data so as not to lose time? I have read up on creating buffers, but I am not sure how much data the Teensy can store, and the rate at which I will be sampling could turn into a lot of data if the Teensy 4's memory is small.

From my understanding, with a 600 MHz processor the maximum I could sample at is 300 MHz. Using 4 mics and comparators, that would put me at 75 MHz per channel. I believe this would be much faster than the on-board ADC could handle.

I am new to Teensy and new to audio sampling in general so any information is very helpful.
 
Hi Tootsie:

I am a bad coder and not an expert in microcontrollers at all, but I built a sound-source ranging device based on a Teensy 3.2 some years ago.

Firstly, I think you cannot assume a sampling rate based on the CPU frequency of the Teensy 4 alone. I may be wrong, but I think the max sampling rate at 12-bit resolution is 1.4 MHz for that board.

On the Teensy 3.2, using the excellent Pedvide ADC library, I was able to get over 500 ksamples/sec at 8-bit resolution on four signals (1+ MHz per channel). I used 4 electret mikes from Sparkfun on a square array and was able to get half a cm of resolution at 1 m distance.
At that sample rate I was limited to a max distance between mikes of 30-40 cm, if I remember correctly, but I learnt that besides a faster sampling rate, a bigger distance between the microphones would dramatically enhance the ranging measurement. In both cases you will need a bigger buffer, though.
I read the ADCs directly into a buffer, normalized the four signals, filtered out everything below a certain threshold, and then cross-correlated them to get the time differences of arrival. I used a multilateration algorithm to get the ranging solution.
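Roughly, the cross-correlation step looked like this (a from-scratch sketch, not my actual code; the names, sizes and lag range are just illustrative):

Code:
// Find the lag (in samples) at which the reference mic best lines up
// with another mic. Multiply the lag by the sample period to get the
// time difference of arrival.
const int N = 2048;        // samples per channel in the capture buffer
const int MAX_LAG = 256;   // largest delay (in samples) worth searching

int bestLag(const float *refMic, const float *otherMic) {
    int best = 0;
    float bestSum = -1e30f;
    for (int lag = -MAX_LAG; lag <= MAX_LAG; lag++) {
        float sum = 0.0f;
        for (int i = 0; i < N; i++) {
            int j = i + lag;
            if (j >= 0 && j < N) sum += refMic[i] * otherMic[j];
        }
        if (sum > bestSum) { bestSum = sum; best = lag; }
    }
    return best;
}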

Hope this helps, Juan
 
Hi Juan,

Thanks for the response. Is there any way you would be able to share the code you wrote for that? It is seemingly very close to what I would be doing, and I am also a bad coder.

If not, was the Teensy 3.2 able to deal with the memory, or did you have to add an SD card?


I am worried about creating a buffer because it might be minutes before a signal is seen. After the signal is seen by the first mic, it will quickly be seen by the next ones.
I am only triangulating in a plane, and I am also using multilateration equations. I have 4 mics but only really need 3 for the math; should I use the first mic that hears the noise as an interrupt and then use the data from the other 3? Or is there a way that I am not thinking of to make sure the data does not overflow?

Also, does anyone know the bit depth of the Teensy 4 ADCs? I cannot find it on the website. I would also assume that using both on-board ADCs in parallel would allow me to sample faster. I just downloaded the Pedvide library and am looking through it now.
 
I am sorry to say that my code got lost with one of my old laptops... and the hardware I used was also repurposed.

Everything was done using a single Teensy 3.2; no additional memory or SD cards were used. I was also putting the sound source in the plane of the mikes for the sake of simplification. I know that you only need 3 microphones, but if you use two channels like I did (connecting two mikes per ADC channel, as you mention), the fourth microphone is free in terms of signal processing, I guess.

Teensy 4 should be even better at this. First, find out what the maximum sampling rate per channel is. I just read that Pedvide's ADC library is now compatible with Teensy 4. Use it.
Then, decide the size of the microphone array, measure the maximum distance between the mikes, and set the buffer size in accordance with the number of samples it will take for the sound to reach the last microphone. Take a headroom of four to six times this figure when setting the buffer size and you should be safe.
In order to get better results, take into account the speed of sound at your altitude. In my case, living close to a mountain range (at about 1000 m), I had to correct that figure a bit.
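As a worked example of that sizing rule (purely illustrative numbers; plug in your own spacing and sample rate):

Code:
// Buffer sizing: samples for sound to cross the array, times 4-6x headroom.
constexpr float soundSpeed_mm_s  = 343000.0f; // correct this for your altitude
constexpr float maxMicSpacing_mm = 400.0f;    // largest mic-to-mic distance
constexpr float sampleRate_Hz    = 500000.0f; // per channel

// ~583 samples at these numbers...
constexpr float samplesAcrossArray =
    maxMicSpacing_mm / soundSpeed_mm_s * sampleRate_Hz;

// ...so round the buffer up to roughly 2400-3600 samples per channel.
constexpr int bufferSize = (int)(samplesAcrossArray * 6.0f);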

Regards, Juan
PS: I think I will order some Teensy 4s and some piezo mikes to try it myself again ;)
 
Ah, too bad, but thanks for the information; it will be very helpful. All of this makes sense. The one thing I still would like to figure out is how to start sampling. If I used all 4 microphones, how would I make sure that each microphone saved its data into the buffer at the time I wanted it to?

For instance, it might be a minute or more before any noise is made. I would not want to be saving data for that full minute, just from when the first microphone "hears" the noise; then start recording to the buffers. I just wouldn't want to lose data due to not recording. Would I just look for voltage changes and, when a certain threshold is hit, start recording?
 
I think I used continuous analog reads, as Pedvide himself suggested, so you don't have to worry about triggering (I hope you don't run on batteries).

=== Quote ===

Start a continuous measurement on both ADCs using pins A2 and A3 (for example) in the setup already.
In the loop:
- store the values value1, value2.
- start a normal measurement on pins A11, A10 and get the values.
- start a continuous measurement again on pins A2, A3.
- process data.

=== End quote ===

You should find the details by browsing through Pedvide's ADC library docs and examples, as I did. Hope this helps.
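Something like this minimal sketch, assuming the method names I remember from the current version of the library (do verify them against the docs and examples):

Code:
#include <ADC.h>

ADC *adc = new ADC();
const int N = 2048;
uint16_t buf2[N], buf3[N];  // capture buffers for pins A2 and A3
int i = 0;

void setup() {
    adc->adc0->setResolution(8);
    adc->adc0->setAveraging(1);
    adc->adc0->setConversionSpeed(ADC_CONVERSION_SPEED::VERY_HIGH_SPEED);
    adc->adc1->setResolution(8);
    adc->adc1->setAveraging(1);
    adc->adc1->setConversionSpeed(ADC_CONVERSION_SPEED::VERY_HIGH_SPEED);
    adc->adc0->startContinuous(A2);  // one pin per ADC, always converting
    adc->adc1->startContinuous(A3);
}

void loop() {
    // store the latest conversion from each ADC
    buf2[i] = (uint16_t)adc->adc0->analogReadContinuous();
    buf3[i] = (uint16_t)adc->adc1->analogReadContinuous();
    i = (i + 1) % N;
    // ...a normal analogRead() on A11/A10 would go here for the other two
    // mics, followed by restarting the continuous measurements, as quoted.
}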

regards, Juan
 
You shouldn't use comparators or interrupts at all. Just use the ADC input(s) and sample the audio. Hook the microphones to the sound-in circuitry and use the Audio library to sample it.
You should know the exact distance between the microphones, because you can't get an accurate direction sideways unless you know this. (Propagation is about 1 ft per millisecond.)
Finally, the way to find sound is to do an autoconvolution between the two channels, to find the phase difference, and the phase delay you find between the two channels of audio will tell you the bearing towards the sound source.
 
Agreed, audio library.

Also, consider using the teensy audio shield.

Also, if you've never done it before, don't underestimate the challenge of learning the signal processing to actually detect your signals and to measure the time delay (the filtering and cross correlation).

If you're just trying to make a simple demo for a class or whatever, you could assume that you're only direction-finding to super loud sounds that have a super sharp onset (like a strong handclap). In that case, you could just set a loudness threshold and measure the sample # (which is time) when the threshold is crossed from quiet to loud. The threshold-crossing time should be diff for the two mics, which should relate to the diff in the sound's travel time, which should scale with the diff in travel distance.
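A toy sketch of that threshold idea (the names and the threshold value are placeholders you'd tune for your setup):

Code:
const float THRESHOLD = 0.5f;  // pick something well above background noise

// Index of the first sample that crosses the threshold quiet-to-loud,
// or -1 if it never does.
int firstCrossing(const float *x, int n) {
    for (int i = 1; i < n; i++) {
        if (x[i - 1] < THRESHOLD && x[i] >= THRESHOLD) return i;
    }
    return -1;
}

// Time difference of arrival between two mics, in seconds:
//   (firstCrossing(micA, n) - firstCrossing(micB, n)) / sampleRate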
 
After lots of research, I've come to the same conclusion you both have: use the ADC and cross-correlate each signal. I took a relatively extensive signal processing class about 6 years ago in college, so I am at least familiar with the ideas behind everything going on.

As of right now I take in the audio on all 4 microphones. When the sound crosses a noise threshold, sampling begins to fill a buffer for all 4 microphones. When this is done, I normalize, then cross-correlate the microphone that first heard the noise with the other 3 microphones. This gives the delay for each of the 3 microphones, which corresponds to a time.
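By "normalize" I mean removing the DC offset and scaling each buffer to unit RMS before cross-correlating, roughly like this sketch (names are mine):

Code:
#include <math.h>

void normalize(float *x, int n) {
    float mean = 0.0f;
    for (int i = 0; i < n; i++) mean += x[i];
    mean /= n;
    float power = 0.0f;
    for (int i = 0; i < n; i++) {
        x[i] -= mean;                 // remove the DC offset
        power += x[i] * x[i];
    }
    float rms = sqrtf(power / n);
    if (rms > 0.0f)
        for (int i = 0; i < n; i++) x[i] /= rms;  // scale to unit RMS
}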

I am now looking to filter the noise in order for the cross-correlation to be more accurate. Below is the frequency response of the signal during the time frame in which I would be sampling. I was thinking that putting a band-pass filter on the microphones from 210-310 Hz would help when cross-correlating the two signals.
[Attached image: wavepad_5NTJJYWXe2.png - frequency response of the signal]

Also, I was thinking there could be a way to convolve each of the signals with the original signal and then cross-correlate to make this even better.


I ordered the audio shield and will take a look at how that can help me as well.
 
When the sound crosses a noise threshold sampling begins to fill a buffer on all 4 microphones.

It may be better to just run the sampling all the time, and when the noise crosses the threshold, wait for a small amount of time, and copy the data you need out of the buffer. That way, you get a copy of the data that happened before the threshold, and you don't need to wait for the ADC to spin up.
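For example (just a sketch, with arbitrary sizes and names): a ring buffer that is always being written, plus a snapshot when the threshold trips:

Code:
const int RING = 4096;      // a few ms of history at a fast sample rate
volatile uint16_t ring[RING];
volatile int head = 0;

void storeSample(uint16_t s) {    // call this for every ADC sample, forever
    ring[head] = s;
    head = (head + 1) % RING;
}

// After the threshold trips and you've waited a little while, copy the
// history out (oldest sample first) and process the copy at your leisure.
void snapshot(uint16_t *dst) {
    int h = head;                 // capture the write position once
    for (int i = 0; i < RING; i++) dst[i] = ring[(h + i) % RING];
}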

I was thinking that putting a band-pass filter on the microphones from 210-310 Hz would help when cross-correlating the two signals.

Which specific frequency band you pay attention to depends entirely on your specific application. If you think 210-310 Hz works for you, then have at it!

I was thinking there could be a way to convolve each of the signals with the original signal and then cross-correlate to make this even better.

Isn't that what cross-correlation does in the first place? Convolve with a shifting window, and figure out which phase gives the highest result?
 
It may be better to just run the sampling all the time, and when the noise crosses the threshold, wait for a small amount of time, and copy the data you need out of the buffer. That way, you get a copy of the data that happened before the threshold, and you don't need to wait for the ADC to spin up.

How would I create a buffer to do this? It is possible that I could be sampling for minutes before a read; at 400 kSPS that would be an insanely large buffer. Is there a way to get around that?

Also, there are only 2 ADCs and 4 mics, so I am pretty sure I constantly need to change which pins are being read by the ADCs, which slows me down a bit, but I do not believe there is a way around that.

Isn't that what cross-correlation does in the first place? Convolve with a shifting window, and figure out which phase gives the highest result?

Yes, you are correct; in essence I am making a matched filter with this. I was just wondering if there is any other kind of filtering that could be done to make the time delay calculation more accurate. One thought, as I mentioned earlier, was using a band-pass filter before cross-correlating (matched filter). If there is something else that could be done, I would love to hear about it.
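For reference, the band-pass I have in mind is a standard biquad with coefficients from the RBJ Audio EQ Cookbook, something like this sketch (the center, Q, and sample rate are just my current guesses):

Code:
#include <math.h>

struct Biquad {
    float b0, b1, b2, a1, a2;
    float z1 = 0, z2 = 0;
    float process(float x) {         // transposed direct form II
        float y = b0 * x + z1;
        z1 = b1 * x - a1 * y + z2;
        z2 = b2 * x - a2 * y;
        return y;
    }
};

// Constant 0 dB peak gain band-pass from the RBJ cookbook.
Biquad makeBandpass(float centerHz, float q, float sampleRate) {
    float w0 = 2.0f * (float)M_PI * centerHz / sampleRate;
    float alpha = sinf(w0) / (2.0f * q);
    float a0 = 1.0f + alpha;
    Biquad f;
    f.b0 = alpha / a0;
    f.b1 = 0.0f;
    f.b2 = -alpha / a0;
    f.a1 = -2.0f * cosf(w0) / a0;
    f.a2 = (1.0f - alpha) / a0;
    return f;
}

// e.g. a ~210-310 Hz pass band: center ~255 Hz, Q = 255/100 = 2.55
// Biquad bp = makeBandpass(255.0f, 2.55f, 44100.0f);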

With the band-pass filter, since I am sampling so quickly and for a limited amount of time, is there a certain number of signal periods I would need? For instance, at 200 Hz and a buffer size of about 5 ms, that turns into 1 period of that frequency. Would this be a problem?
 
If your target signal is only 210-310 Hz, sampling at 400 kHz does not get you better accuracy. For any reasonable signal-to-noise ratio, you're not going to do better than 1/2 to 1/4 of a wavelength, so sampling at 2-4 times the Nyquist requirement (so, 2x(2*310) = 1240 Hz, or 4x(2*310) = 2480 Hz) will be just as good. So, at a slower sample rate, your buffers will be way smaller.

Alternatively, if you use the Audio library, you're fixed at 44100 Hz, which will resurrect the question of buffering.

Chip
 
I struggle to see how sampling more often will not get me better accuracy. If I sample a 500 Hz sine wave through 2 different microphones, sample it at 1000 Hz, and then cross-correlate them to find the lag, the best accuracy for the time delay I could get would be 1 ms, because that is how often I am sampling. If I sampled at 100 kHz, then the lag accuracy could theoretically be 10 µs.
 
This is a question of resolution vs accuracy. Having a higher sample rate will give you an answer with more decimal places (ie, higher resolution), but those extra decimal places are not reliable or repeatable. The problem is noise. Noise destroys your ability to super-resolve a signal. If there were no noise, a higher sample rate does allow you to better resolve. With noise, any additional resolution is false.

There has been a ton of work on ranging over the decades... the two primary limits on resolution are the signal's bandwidth and the level of competing noise. You then choose your sample rate to be no worse than the limits imposed by the bandwidth and the noise.

In your case, your bandwidth is 310 Hz minus 210 Hz = 100 Hz. This implies that your basic unit of resolution will be something that scales with 1/100th of a second times the speed of sound (at ~343 m/s, that's roughly 3.4 m). The actual resolution that you achieve scales from this number based on how much noise is present (the signal-to-noise ratio, SNR).

For really high background noise (low SNR), you'll be comparing two signals that are highly corrupted by noise. Think signals that are nearly unrecognizable. When do they line up well? It's hard to tell! With a proper matched filter or cross-correlation, your resolution will probably be within a factor of 2 of this "sound_speed / 100" limit.

As you get less background noise (as the SNR improves), you'll be comparing cleaner and cleaner signals. Cleaner signals allow you to better resolve the best alignment of the two signals. This will definitely improve your resolution over this basic "sound_speed / 100" scale factor. But, "sound_speed / 100" is still the benchmark for judging how much improvement might be possible. With moderately clean signals, can you make the resolution 10x better? Yeah, probably. How about 20x better? That'll be tough. Can you make it 100x better? Probably not. At least, not with signals recorded in the real world.

To support 10x better than baseline, you would need a sample rate of (2x310)x10 = 6200 Hz.
To support 100x better than baseline, you would need a sample rate of (2x310)x100 = 62000 Hz.

Hence, I think that 400 kHz won't contribute any benefit over a slower speed.

You can do whatever you'd like... because this is totally a fun hobby and you've got a great project here. Having fun is the goal. If having your system crank along at 400 kHz is what makes this fun, absolutely go do it! Alternatively, if the goal is to get a working system with less frustration, that fast a sample rate will probably add to the frustration. If so, you could consider dropping your sample rate without any loss in ranging resolution, IMO.

Whatever you think is fun! Fun is the goal!

Chip
 
Thanks for the information. I am definitely learning about all of this as I go. Like you say, this is all fun, but at the same time I would like to try to get this to work with as little frustration as possible.

Now, I have total control over the bandwidth I want to look at, so would decreasing the bandwidth help my resolution? I see frequencies of up to around 1000 Hz at gain levels higher than ambient noise. Could I use this to help get better resolution?

Also, could I oversample as planned and then downsample to help my resolution? Or could I use the audio library, go with 16-bit resolution at 44.1 kHz, and upsample for better resolution?
 
For any reasonable signal-to-noise ratio, you're not going to do better than 1/2 to 1/4 of a wavelength

I would want more waveforms, to get a better (longer) fit. I'd also want the trigger sound to be of a higher frequency.
You might be able to use a band-pass filter with a lower cut-off at 200 Hz and a higher cut-off at something higher, like 2 kHz or more.
Then sample multiple cycles of the 200 Hz fundamental to get the best chance of a good match.

The most obvious failure case when you only sample one cycle is a full-cycle phase reversal -- picking up on the wrong matching peak. At 200 Hz, that's about 5 feet between the microphones, so maybe your system is fairly immune to that at 1.5 feet, but I wouldn't push it :)
 
Thanks for the information. I am definitely learning about all of this as I go. Like you say, this is all fun, but at the same time I would like to try to get this to work with as little frustration as possible.

Now, I have total control over the bandwidth I want to look at, so would decreasing the bandwidth help my resolution? I see frequencies of up to around 1000 Hz at gain levels higher than ambient noise. Could I use this to help get better resolution?

Also, could I oversample as planned and then downsample to help my resolution? Or could I use the audio library, go with 16-bit resolution at 44.1 kHz, and upsample for better resolution?

Signals with a wider bandwidth will definitely help improve your resolution. But it might make detection harder (as a human trying to see what's going on while debugging).

What would really help is starting simple. Get two mics recording audio. Make a sound. A good and loud one (relative to background noise). Once you have the two recordings, try to measure the time difference between the two signals. Use whatever method you'd like. Actually, try a couple of methods. If the signals are loud, the method isn't important, which is why everything you try will probably work. How fun is that! What is important about this simple trial is that you get something working end-to-end. You'll have gained real experience. You'll quickly find out what is easy and what is hard (for you).

I find that it's way easier (less frustrating and with greater chance of success) to tweak and refine a working system than it is to jump straight from nothing to some hypothetical dreamy high-performance end goal. Baby steps usually work better than a big bang.
 
In my opinion, you absolutely want to use the audio library for this. 16 bits is going to be much more important than oversampling, assuming that you can get a low-noise electronics implementation to match (microphones, wires, power supply, etc.).

Some simple math:
- distance: 450 mm between microphones
- sound speed: 340000 mm/s
- sampling rate: 44100 Hz
- sound speed per sample: 7.7 mm
- phase difference between microphones: 58.4 samples
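The same arithmetic in code form, if you want to paste it somewhere (constants from the list above):

Code:
constexpr float micDistance_mm  = 450.0f;
constexpr float soundSpeed_mm_s = 340000.0f;
constexpr float fs_Hz           = 44100.0f;
constexpr float mmPerSample     = soundSpeed_mm_s / fs_Hz;          // ~7.7 mm
constexpr float maxDelaySamples = micDistance_mm / mmPerSample;     // ~58.4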

You can of course fit the phase difference to better than a single sample. The more bit resolution you have, the better for that. (But there will, of course, be the problem of background noise.)

If you want higher sampling rate, use audio codecs with 96 kHz or 192 kHz support, and re-compile the audio library with higher sampling rate.
 
Yes, I almost have the audio board in, so when I get that I will likely try both approaches. I currently have a pretty good baseline, but I will have to mess around with the band-pass filter, sampling rates, and total samples to see which tend to help more or less.

One thing I am not fully understanding is how to fit a phase difference to better than a single sample. Given the discrete nature of the data, how would I code that to get a delay more accurate than one sample?

Also, I've read that the Teensy 3s do not usually produce true 16-bit data and it's more like 13 bits. Has that been fixed with the 4 or the audio library?
 
If you want high-quality ADCs for audio signals, you do not use anything inside the Teensy (and the Teensy 4 ADCs are worse than the T3.x's, as you'd expect from the noise coupling from the higher-current, high-frequency switching going on all over the T4.x chips). Using an external ADC makes it much easier to get good SNR values.
 
The built-in ADCs on the microcontroller are not even 13 bits unless your analog front end and buffering are quite excellent.
Meanwhile, the ADCs built into the audio board (and other audio systems) are of a totally different class, and generally have very high resolution, as long as the rest of the solution is of good quality. Audio codec != built-in ADC.

Regarding fitting a phase difference, you can do one of two things:

A: Once you have the match, walk from a known point to the next zero-crossing location -- where one sample is > 0 and the next is < 0 (or vice versa).
Do this on the source and on the destination. Calculate the fractional sample position where the signal crosses zero by drawing a line between the samples. This gives you the fractional position of that zero crossing; you can then subtract the source position from the target position.
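A sketch of that, assuming a positive-to-negative crossing (illustrative names):

Code:
// Fractional sample index of the first downward zero crossing at or
// after 'start', or -1.0f if none is found.
float zeroCrossing(const float *x, int n, int start) {
    for (int i = start; i + 1 < n; i++) {
        if (x[i] > 0.0f && x[i + 1] <= 0.0f) {
            // Where does the line from x[i] to x[i+1] hit zero?
            return i + x[i] / (x[i] - x[i + 1]);
        }
    }
    return -1.0f;
}

// fractional delay = zeroCrossing(target, n, k) - zeroCrossing(source, n, k)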

B: Do the convolution on data that is a filtered accessor of the input data. E.g., when it wants to read position X, apply a fractional offset to X and reconstruct the value you return using some kind of interpolation. A third-order (cubic) polynomial will give you better than 60 dB signal/noise in the interpolation (and in some cases, much better). An alternative is to run a mini-convolution of some number of taps of a FIR filter of a sinc() function to return the value. This can easily get you to > 90 dB signal/noise reconstruction. Use binary search to find the best fractional sample offset. (You can accelerate this by pre-computing the fractionally delayed buffer once for each iteration, rather than literally running the interpolation across the source samples each time the convolution wants to read a sample.)
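For B, the filtered accessor could be as simple as this cubic (Catmull-Rom) read; a windowed-sinc FIR reconstructs more cleanly, as noted above (sketch only, names illustrative):

Code:
// Read the buffer at a fractional position using a third-order polynomial.
float readFractional(const float *x, int n, float pos) {
    int i = (int)pos;
    float t = pos - i;
    if (i < 1) { i = 1; t = 0.0f; }          // keep i-1 .. i+2 in bounds
    if (i > n - 3) { i = n - 3; t = 1.0f; }
    float xm1 = x[i - 1], x0 = x[i], x1 = x[i + 1], x2 = x[i + 2];
    // Catmull-Rom cubic through the four neighboring samples
    float a = -0.5f * xm1 + 1.5f * x0 - 1.5f * x1 + 0.5f * x2;
    float b = xm1 - 2.5f * x0 + 2.0f * x1 - 0.5f * x2;
    float c = 0.5f * (x1 - xm1);
    return ((a * t + b) * t + c) * t + x0;
}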
 
Any update on this project, @tootsie? It sounds like something I am interested in doing, and I would love to know if you got it to work!
 
Finally, the way to find sound is to do an autoconvolution between the two channels ...
I think you mean correlation, not convolution (very different things), and secondly it's cross-correlation; an autocorrelation cannot be between two signals!
 
I have had a background project like this for a while now (ie, lots of thinking, not much doing). One question that comes up: why use an ADC at all? Why not a simple envelope detector (like here) for each microphone into a comparator, and then an interrupt to start/sample a timer? If the actual frequency isn't important, this seems like a much simpler and more direct way to go. Am I missing something?
 
I have had a background project like this for a while now (ie, lots of thinking, not much doing). One question that comes up: why use an ADC at all? Why not a simple envelope detector (like here) for each microphone into a comparator, and then an interrupt to start/sample a timer? If the actual frequency isn't important, this seems like a much simpler and more direct way to go. Am I missing something?

Yes, I think so: the rise time of the signal is not instantaneous, and the thresholding will likely be different for each microphone due to signal-level variation. Cross-correlation uses all the information in the signal to determine the best estimate of the delay, whatever the rise time. It will be fairly insensitive to the step response of the microphones too, assuming they are the same.

Cross-correlation will work well with lots of noise present too, as the noise averages out and the signal adds coherently.
 