Replace NaN's in NumPy array with closest non-NaN value
As an alternate solution (this will also linearly interpolate any NaNs in the middle of the array):
import numpy as np

# Generate data...
data = np.random.random(10)
data[:2] = np.nan
data[-1] = np.nan
data[4:6] = np.nan
print(data)

# Fill in NaN's...
mask = np.isnan(data)
data[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), data[~mask])
print(data)
This yields:
[ nan nan 0.31619306 0.25818765 nan nan 0.27410025 0.23347532 0.02418698 nan]
[ 0.31619306 0.31619306 0.31619306 0.25818765 0.26349185 0.26879605 0.27410025 0.23347532 0.02418698 0.02418698]
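The leading and trailing NaNs end up equal to the nearest valid value because `np.interp`, by default, returns the first and last `fp` values for points outside the range of `xp`. Here is a minimal sketch of the same idea wrapped in a function with fixed input, so the result is reproducible. The name `fill_nans_interp` and the choice to work on a copy are my own assumptions, not part of the answer above.

```python
import numpy as np

def fill_nans_interp(values):
    """Return a copy of a 1-D array with NaNs filled via np.interp (hypothetical helper)."""
    out = np.array(values, dtype=float)  # copy, so the caller's array is left untouched
    mask = np.isnan(out)
    if mask.all():
        raise ValueError("array contains no non-NaN values to interpolate from")
    # Interior NaNs are linearly interpolated; NaNs outside the valid range
    # are clamped to the first/last valid value (np.interp's default behavior).
    out[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), out[~mask])
    return out

filled = fill_nans_interp([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])
print(filled)  # values are 1, 1, 2, 3, 4, 4
```

Note that the interior NaNs become 2 and 3 here, which are interpolated values rather than copies of the closest neighbor.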
I want to replace each NaN with the closest non-NaN value... there will be no NaN's in the middle of the numbers
The following will do it:
ind = np.where(~np.isnan(a))[0]
first, last = ind[0], ind[-1]
a[:first] = a[first]
a[last + 1:] = a[last]
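For anyone who wants to run this directly, here is a rough, self-contained sketch of the same slicing approach. The function name `fill_edge_nans`, the sample array, and the guard for an all-NaN input (where `ind[0]` would otherwise raise an `IndexError`) are additions of mine and only one possible way to handle that case.

```python
import numpy as np

def fill_edge_nans(values):
    """Return a copy with leading/trailing NaNs replaced by the nearest valid value."""
    out = np.array(values, dtype=float)  # work on a copy instead of modifying the input
    ind = np.flatnonzero(~np.isnan(out))
    if ind.size == 0:
        return out  # assumption: with no valid values, leave the array as all-NaN
    first, last = ind[0], ind[-1]
    out[:first] = out[first]      # fill the leading NaNs
    out[last + 1:] = out[last]    # fill the trailing NaNs
    return out

a = np.array([np.nan, np.nan, 0.3, 0.7, 0.2, np.nan])
fixed = fill_edge_nans(a)
print(fixed)  # values are 0.3, 0.3, 0.3, 0.7, 0.2, 0.2
```

Unlike the in-place snippet above, this version leaves `a` unchanged, which may or may not be what you want.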
This is a straight numpy solution requiring no Python loops, no recursion, no list comprehensions etc.
NaNs have the interesting property of comparing unequal to themselves, so we can quickly find the indices of the non-NaN elements:
idx = np.nonzero(a==a)[0]
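A quick check of that property, using a small made-up array (the sample values are only for illustration). It also shows that `a == a` selects the same elements as the more explicit `~np.isnan(a)`:

```python
import numpy as np

a = np.array([np.nan, 2.0, 3.0, np.nan])

print(np.nan == np.nan)      # False: NaN never compares equal, even to itself

equal_to_self = (a == a)     # True only where the element is not NaN
not_nan = ~np.isnan(a)       # the more explicit spelling of the same mask

idx = np.nonzero(equal_to_self)[0]
print(idx)                   # indices 1 and 2 hold the non-NaN values
```

The `a == a` trick is compact, but `np.isnan` is probably clearer to readers who don't know about the NaN comparison rule.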
It's now easy to replace the NaNs with the desired value:
for i in range(0, idx[0]):
    a[i] = a[idx[0]]
for i in range(idx[-1]+1, a.size):
    a[i] = a[idx[-1]]
Finally, we can put this in a function:
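import numpy as np

def FixNaNs(arr):
    if len(arr.shape) > 1:
        raise Exception("Only 1D arrays are supported.")
    # NaN != NaN, so this picks the indices of the non-NaN elements
    idxs = np.nonzero(arr == arr)[0]
    if len(idxs) == 0:
        return None
    ret = arr  # note: no copy is made, so arr itself is modified
    # fill leading NaNs with the first valid value
    for i in range(0, idxs[0]):
        ret[i] = ret[idxs[0]]
    # fill trailing NaNs with the last valid value
    for i in range(idxs[-1]+1, ret.size):
        ret[i] = ret[idxs[-1]]
    return ret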
Edit: Ouch, coming from C++ I always forget about list ranges... @aix's solution is way more elegant and efficient than my C++-ish loops, so use that instead of mine.