Song identification from PC into the line-in jack

Status
Not open for further replies.
Hello, I'm a novice with electrical signals and audio processing, but I have a really interesting idea that I'd like some insight into.

The broad use case is such:
I'd like to connect the 1/8" jack from my headphones to my teensy/audioboard and do some processing on the audio received to see if I can identify what song is being played.
This boils down into 2 specific cases:

1. Learning: I'd like to look at the input and learn as much as I can about the song being played (say, BPM, number of peaks seen over a period of time, or duration of the signal). I would store this information on, say, an SD card.
2. Identification: Then for any arbitrary audio input, I'd like to try and identify which song is being piped into the teensy.

Additional information:
The number of songs I need to recognize is small (fewer than 10).

My questions are:
1. Can I safely connect the headphone jack to the line in of the Teensy audio board?
2. Is there a better way to do this, or is this crazy (namely, am I using the wrong parts, or is the idea ill-conceived)?

Thanks again, I can't wait to see if this is something I can accomplish :)
 
Headphone to line in can work, but you need to be careful with the maximum volume and with what the earth potential is. Normal PC design puts the shield at earth potential, but it is not impossible to find something more exotic going on. A simple check is to measure the voltage between the USB ground and the headset shield and confirm it is zero volts, assuming the plan is to power the Teensy from the PC USB.

In terms of song recognition, 10 is probably possible, but it will be pretty basic and low reliability. A simple option may be to start from the amplitude monitoring example: break the audio into chunks of X time, reduce the amplitude values in each chunk to average/max/min, and each cycle push that chunk's values into a FIFO array. Compare the array each cycle with a precomputed array, looking for a match. Other, smarter options involving FFT may work but will ask more of your programming ability to achieve in limited RAM without floating point.
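To make the chunk/FIFO idea concrete, here is a rough sketch of how it could be prototyped on a PC before porting to the Teensy. It is written in Python purely for illustration; the function names, the sum-of-absolute-differences comparison, and the requirement that all reference arrays have the same length are my assumptions, not anything from the Teensy examples. Integer arithmetic is used throughout to mirror the "no floating point" constraint.

```python
from collections import deque

def chunk_features(samples, chunk_len):
    """Split integer samples into fixed-length chunks and reduce each
    chunk to (average, max, min) of the absolute amplitude."""
    feats = []
    for start in range(0, len(samples) - chunk_len + 1, chunk_len):
        chunk = [abs(s) for s in samples[start:start + chunk_len]]
        feats.append((sum(chunk) // chunk_len, max(chunk), min(chunk)))
    return feats

def distance(window, reference):
    """Sum of absolute differences between two equal-length feature lists
    (an assumed similarity measure; smaller means more alike)."""
    return sum(abs(a - b)
               for fw, fr in zip(window, reference)
               for a, b in zip(fw, fr))

def identify(stream_feats, references, threshold):
    """Push one chunk's features per cycle into a FIFO and compare the FIFO
    with every precomputed reference. Returns (song name, chunk index) for
    the first match within `threshold`, or None if nothing matches.
    Assumes every reference has the same number of chunks."""
    n = len(next(iter(references.values())))
    fifo = deque(maxlen=n)  # oldest chunk drops out automatically
    for i, feat in enumerate(stream_feats):
        fifo.append(feat)
        if len(fifo) < n:
            continue  # not enough history yet
        for name, ref in references.items():
            if distance(fifo, ref) <= threshold:
                return name, i
    return None
```

The threshold would have to be tuned by hand against real recordings, and because the features depend directly on amplitude, a change in playback volume would shift every value.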
 
Hi! Thanks so much for your response.

Some follow ups:
The Teensy is going to be powered by a rechargeable battery, something off the shelf like a Jackery Bolt. Would this pose any additional issues?

Also, how would one mitigate a problematic maximum volume or earth potential?

FFT seems like a nice solution, but to start I'd probably just try something basic and make sure I don't fry any electronics in the process 😊
 
A cool thing would be to connect it online, make some kind of song fingerprint, and send it to some recognition software with an API, but I don't know how much of that is actually possible...
 
Powering from a USB battery bank will actually simplify things, since it can float. Switching noise can be a potential problem, but this is not intended to be a data collection device or anything where the last significant digit matters. You may also find that smarter power banks decide your circuit is not drawing enough current and shut down to save power.
 
I wonder why one would make that effort. Applications like Shazam, which do exactly that, have existed for a long time...

It's possible you are just replying to the Shazam-like use case proposed by lorenzofattori :)

But for me this is important because:
I practice at a dance studio where we learn a choreography, and we often need to tape the entire practice and then cut out the portions of the video where we actually run the choreography to music.
Finding the cuts is difficult and time consuming. Possible solutions:

1. Asking someone to push a button (or note the time) takes additional cycles that aren't readily available (i.e. everyone is 'on' doing something else, and this is just one more thing that someone has to babysit).
2. I could just write this on a Java/Android stack and have a phone passively listen and then do some analysis on the stream, but this solution has about 0 Teensy-ness involved (i.e. zero fun).
3. The proposed solution is to split the stereo jack, with one feed going into the Teensy audio board and the second into the speaker system. I can then use the Teensy + audio board to do some analysis. So much to learn here, plus there's a Teensy involved, and that's super fun. The camera we use to tape has a wifi trigger capability, so I could even have the Teensy trigger the camera via an ESP8266 and cut choreo runs automatically. How cool would that be! :D :D

The desire to discern a song comes from the fact that this studio has up to 8 choreographies running simultaneously, and at some practices we run multiple songs, so I want to be able to tag the video by the song playing.
 
@gremlinwrangler Thanks again for your insight. I'm getting my 1/8" audio jacks in the mail today and am going to solder everything together. Hopefully I don't fry anything by going over the maximum volume! (Any thoughts on how to attenuate this would be great :D )
 
Re attenuation, the boring method is to just adjust the volume down below 1 V peak to peak and leave it there. My main concern is that a headset output these days can be powerful enough to run small speakers when wound all the way up, and that is not quite what a line in wants. Starting with the volume turned down and then bringing it up until you get a good signal without clipping or other artifacts should work. If you do not have control over the volume setting, then there may need to be some design thought about constraining things. Variable volume may also complicate your detection process.
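One way to judge "a good signal without clipping" is to look at a captured block of samples and count how many sit at or near full scale. The sketch below is only an illustration in Python; it assumes 16-bit signed samples, and the function name and the 98% margin are arbitrary choices of mine.

```python
def clipping_report(samples, full_scale=32767, margin=0.98):
    """Return (peak absolute sample, fraction of samples at or above
    margin * full_scale). Assumes 16-bit signed integer samples."""
    limit = int(full_scale * margin)
    peak = max(abs(s) for s in samples)
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return peak, clipped / len(samples)
```

If more than a tiny fraction of samples reach the limit, the source volume is probably too high; if the peak stays far below full scale, there is room to bring it up.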

Looking at your actual use case, my thought is that an FFT looking for cyclic beats at a hand-selected frequency may work better than pure amplitude matching, since I think you are not so much looking for 'song X is playing' as 'a song at beat rate Y is playing'.
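As a rough illustration of the beat-rate idea, the sketch below measures the energy at one hand-selected frequency in each short block and then looks for the repeat interval of that energy. Note that it swaps in the Goertzel algorithm (a single-bin DFT) for a full FFT and uses a plain autocorrelation to find the period; both choices, the function names, and the block sizes are my assumptions, and it uses floating point for readability, so it is a PC prototype rather than Teensy-ready code.

```python
import math

def goertzel_power(block, sample_rate, freq):
    """Power of a single frequency in one block (Goertzel algorithm)."""
    k = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s1 = s2 = 0.0
    for x in block:
        s0 = x + k * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - k * s1 * s2

def band_envelope(samples, sample_rate, freq, block_len):
    """Energy at `freq` for each consecutive block of `block_len` samples."""
    return [goertzel_power(samples[i:i + block_len], sample_rate, freq)
            for i in range(0, len(samples) - block_len + 1, block_len)]

def beat_period_blocks(envelope, min_lag, max_lag):
    """Lag (in blocks) with the strongest autocorrelation of the
    mean-removed envelope. Requires max_lag < len(envelope)."""
    mean = sum(envelope) / len(envelope)
    e = [v - mean for v in envelope]
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(e[i] * e[i + lag] for i in range(len(e) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def bpm(lag_blocks, block_len, sample_rate):
    """Convert a beat period measured in blocks to beats per minute."""
    return 60.0 * sample_rate / (lag_blocks * block_len)
```

The lag range sets which tempos can be detected, so it would need to bracket the beat rates of the handful of songs in question. Songs with similar tempos would not be told apart by this alone.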

Other options to look at if this gets complex are light level, distance/gesture sensing (wave near the thing to trigger), or using the Teensy to actually start the music running, via USB keyboard or by directly bit-bashing a button.
 