
Python frequency detection


The aubio libraries have been wrapped with SWIG and can thus be used from Python. Among their many features are several methods for pitch detection/estimation, including the YIN algorithm and some harmonic comb algorithms.
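
In case it helps, here is a minimal sketch of running YIN over a file with aubio's Python bindings. Note that newer versions of aubio ship their own native Python module rather than the SWIG wrappers, so the exact API may differ by version, and the file name is just an example:

import aubio

hop_s = 512    # samples read per iteration
win_s = 2048   # analysis window size

# open the file; samplerate=0 means "use the file's own rate"
src = aubio.source("test-tones/440hz.wav", samplerate=0, hop_size=hop_s)

# YIN pitch estimator, reporting results in Hz
pitch_o = aubio.pitch("yin", win_s, hop_s, src.samplerate)
pitch_o.set_unit("Hz")

while True:
    samples, read = src()            # read the next hop
    freq = pitch_o(samples)[0]       # estimated pitch for this hop
    conf = pitch_o.get_confidence()  # estimate quality, 0..1
    print("%.2f Hz (confidence %.2f)" % (freq, conf))
    if read < hop_s:                 # end of file
        break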

However, if you want something simpler, I wrote some code for pitch estimation some time ago; you can take it or leave it. It won't be as accurate as the algorithms in aubio, but it might be good enough for your needs. I basically just took the FFT of the data times a window (a Blackman window in this case), squared the FFT values, found the bin with the highest value, and used quadratic interpolation around the peak (using the log of the max value and its two neighboring values) to find the fundamental frequency. The quadratic interpolation I took from a paper that I found.
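
In case the interpolation step is unclear: with y0, y1, y2 the log magnitudes at bins k-1, k, k+1 (where k is the peak bin), the fractional bin offset of the parabola's vertex is

    p = 0.5 * (y2 - y0) / (2*y1 - y2 - y0)

and the estimated frequency is (k + p) * RATE / chunk, which is exactly what the code below computes.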

It works fairly well on test tones, but it will not be as robust or as accurate as the other methods mentioned above. The accuracy can be increased by increasing the chunk size (and reduced by decreasing it): the FFT bin spacing is RATE/chunk, so at 44100 Hz a chunk of 2048 samples gives bins about 21.5 Hz apart before interpolation. The chunk size should be a power of two to make full use of the FFT. Also, I am only determining the fundamental pitch for each chunk, with no overlap between chunks. I used PyAudio to play the sound back while printing out the estimated pitch.

Source Code:

# Read in a WAV and find the freqs
import struct
import wave

import numpy as np
import pyaudio

chunk = 2048

# open up a wave
wf = wave.open('test-tones/440hz.wav', 'rb')
swidth = wf.getsampwidth()
RATE = wf.getframerate()

# use a Blackman window
window = np.blackman(chunk)

# open stream
p = pyaudio.PyAudio()
stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                channels=wf.getnchannels(),
                rate=RATE,
                output=True)

# read some data
data = wf.readframes(chunk)

# play stream and find the frequency of each chunk
while len(data) == chunk * swidth:
    # write data out to the audio stream
    stream.write(data)
    # unpack the data and multiply by the Blackman window
    indata = np.array(struct.unpack("%dh" % (len(data) // swidth),
                                    data)) * window
    # take the FFT and square each value
    fftData = abs(np.fft.rfft(indata)) ** 2
    # find the maximum (skipping the DC bin)
    which = fftData[1:].argmax() + 1
    # use quadratic interpolation around the max, unless the peak
    # is the last bin and has no right-hand neighbor
    if which != len(fftData) - 1:
        y0, y1, y2 = np.log(fftData[which - 1:which + 2])
        x1 = (y2 - y0) * .5 / (2 * y1 - y2 - y0)
        # find the frequency and output it
        thefreq = (which + x1) * RATE / chunk
    else:
        thefreq = which * RATE / chunk
    print("The freq is %f Hz." % thefreq)
    # read some more data
    data = wf.readframes(chunk)

# flush whatever partial chunk is left, then clean up
if data:
    stream.write(data)
stream.close()
p.terminate()


If you're going to use FSK (frequency shift keying) for encoding data, you're probably better off using the Goertzel algorithm so you can check just the frequencies you want, instead of a full DFT/FFT.
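
For illustration, here is a minimal sketch of the Goertzel algorithm in plain Python. The function name, the mark/space frequencies (loosely modeled on Bell 103-style FSK), and the chunk_samples variable are all just examples for the sketch:

import math

def goertzel_power(samples, sample_rate, target_freq):
    """Squared magnitude of `samples` at the DFT bin nearest `target_freq`."""
    n = len(samples)
    k = int(0.5 + n * target_freq / sample_rate)   # nearest bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # power at the selected bin, without computing the full FFT
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

# FSK decision for one chunk: compare power at the two keying tones,
# where chunk_samples is whatever block of PCM samples you are decoding
p_mark = goertzel_power(chunk_samples, 8000, 1270.0)   # "1" tone
p_space = goertzel_power(chunk_samples, 8000, 1070.0)  # "0" tone
bit = 1 if p_mark > p_space else 0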


You can compute the frequency spectrum of sliding windows over your sound and then check for the presence of the prevalent frequency band by measuring the area under the frequency spectrum curve for that band.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import auc

np.random.seed(0)

# Sine sample with a frequency of 5 Hz, plus some noise
sr = 32  # sampling rate
y = np.linspace(0, 5 * 2 * np.pi, sr)
y = np.tile(np.sin(y), 5)
y += np.random.normal(0, 1, y.shape)
t = np.arange(len(y)) / float(sr)

# Generate frequency spectrum
spectrum, freqs, _ = plt.magnitude_spectrum(y, sr)

# Calculate the percentage of spectrum area in a frequency band
lower_frq, upper_frq = 4, 6
ind_band = np.where((freqs > lower_frq) & (freqs < upper_frq))
plt.fill_between(freqs[ind_band], spectrum[ind_band], color='red', alpha=0.6)
frq_band_perc = auc(freqs[ind_band], spectrum[ind_band]) / auc(freqs, spectrum)
print('{:.1%}'.format(frq_band_perc))
# 19.8%

[Plot: magnitude spectrum of the noisy 5 Hz signal, with the 4-6 Hz band under the curve shaded in red]