How to smooth a curve in the right way?
I prefer a Savitzky-Golay filter. It uses least squares to regress a small window of your data onto a polynomial, then uses the polynomial to estimate the point in the center of the window. Finally the window is shifted forward by one data point and the process repeats. This continues until every point has been optimally adjusted relative to its neighbors. It works great even with noisy samples from non-periodic and non-linear sources.
Here is a thorough cookbook example. See my code below to get an idea of how easy it is to use. Note: I left out the code for defining the
savitzky_golay() function because you can literally copy/paste it from the cookbook example I linked above.
import numpy as npimport matplotlib.pyplot as pltx = np.linspace(0,2*np.pi,100)y = np.sin(x) + np.random.random(100) * 0.2yhat = savitzky_golay(y, 51, 3) # window size 51, polynomial order 3plt.plot(x,y)plt.plot(x,yhat, color='red')plt.show()
UPDATE: It has come to my attention that the cookbook example I linked to has been taken down. Fortunately, the Savitzky-Golay filter has been incorporated into the SciPy library, as pointed out by @dodohjk (thanks @bicarlsen for the updated link).To adapt the above code by using SciPy source, type:
from scipy.signal import savgol_filteryhat = savgol_filter(y, 51, 3) # window size 51, polynomial order 3
EDIT: look at this answer. Using
np.cumsum is much faster than
A quick and dirty way to smooth data I use, based on a moving average box (by convolution):
x = np.linspace(0,2*np.pi,100)y = np.sin(x) + np.random.random(100) * 0.8def smooth(y, box_pts): box = np.ones(box_pts)/box_pts y_smooth = np.convolve(y, box, mode='same') return y_smoothplot(x, y,'o')plot(x, smooth(y,3), 'r-', lw=2)plot(x, smooth(y,19), 'g-', lw=2)
If you are interested in a "smooth" version of a signal that is periodic (like your example), then a FFT is the right way to go. Take the fourier transform and subtract out the low-contributing frequencies:
import numpy as npimport scipy.fftpackN = 100x = np.linspace(0,2*np.pi,N)y = np.sin(x) + np.random.random(N) * 0.2w = scipy.fftpack.rfft(y)f = scipy.fftpack.rfftfreq(N, x-x)spectrum = w**2cutoff_idx = spectrum < (spectrum.max()/5)w2 = w.copy()w2[cutoff_idx] = 0y2 = scipy.fftpack.irfft(w2)
Even if your signal is not completely periodic, this will do a great job of subtracting out white noise. There a many types of filters to use (high-pass, low-pass, etc...), the appropriate one is dependent on what you are looking for.