Replace NaN's in NumPy array with closest non-NaN value


As an alternative solution (this also linearly interpolates over NaNs in the middle of the array):

    import numpy as np

    # Generate data...
    data = np.random.random(10)
    data[:2] = np.nan
    data[-1] = np.nan
    data[4:6] = np.nan
    print(data)

    # Fill in NaN's...
    mask = np.isnan(data)
    data[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), data[~mask])
    print(data)

This yields:

    [        nan         nan  0.31619306  0.25818765         nan         nan  0.27410025  0.23347532  0.02418698         nan]
    [ 0.31619306  0.31619306  0.31619306  0.25818765  0.26349185  0.26879605  0.27410025  0.23347532  0.02418698  0.02418698]
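The interpolation approach can be wrapped in a small helper. This is only a sketch, not part of the original answer: the name `fill_nan_interp` is hypothetical, and it assumes a 1D float array with at least one non-NaN value. It relies on `np.interp` holding the first and last known values constant outside the range of known points, which is what fills the leading and trailing NaNs.

```python
import numpy as np


def fill_nan_interp(data):
    """Return a copy of a 1D float array with NaNs filled via np.interp.

    Hypothetical helper for illustration; assumes at least one non-NaN value.
    """
    out = np.array(data, dtype=float)  # work on a copy, leave the input alone
    mask = np.isnan(out)
    if mask.all():
        raise ValueError("array contains only NaNs")
    # The x-coordinates are simply the integer positions in the array.
    out[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), out[~mask])
    return out


example = np.array([np.nan, np.nan, 1.0, 2.0, np.nan, np.nan, 5.0, np.nan])
filled = fill_nan_interp(example)
print(filled)
```

Note that interior gaps are filled with interpolated values, not with a copy of the nearest neighbour, so this only matches "closest non-NaN value" exactly at the two ends of the array.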


I want to replace each NaN with the closest non-NaN value... there will be no NaN's in the middle of the numbers

The following will do it:

    ind = np.where(~np.isnan(a))[0]
    first, last = ind[0], ind[-1]
    a[:first] = a[first]
    a[last + 1:] = a[last]

This is a pure NumPy solution that needs no Python loops, recursion, or list comprehensions.
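One caveat: if the array holds nothing but NaNs, `ind` is empty and `ind[0]` raises an `IndexError`. A minimal sketch of a guarded version follows; the function name `fill_edge_nans` is hypothetical, and returning an all-NaN array unchanged is just one possible choice. Unlike the snippet above, it works on a copy.

```python
import numpy as np


def fill_edge_nans(a):
    """Fill leading/trailing NaNs of a 1D array with the nearest valid value.

    Hypothetical helper; returns a copy and assumes no NaNs in the middle.
    """
    out = np.array(a, dtype=float)
    ind = np.flatnonzero(~np.isnan(out))
    if ind.size == 0:
        return out  # nothing to copy from; leave the all-NaN array as is
    first, last = ind[0], ind[-1]
    out[:first] = out[first]      # leading NaNs take the first valid value
    out[last + 1:] = out[last]    # trailing NaNs take the last valid value
    return out


a = np.array([np.nan, np.nan, 3.0, 4.0, 5.0, np.nan])
print(fill_edge_nans(a))
```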


NaNs have the interesting property of comparing unequal to themselves, so we can quickly find the indices of the non-NaN elements:

    idx = np.nonzero(a == a)[0]
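A quick check of that property, as an illustrative sketch with made-up data: for a float array, `a == a` gives the same mask as `~np.isnan(a)`, and the latter states the intent more directly.

```python
import numpy as np

a = np.array([np.nan, 1.0, 2.0, np.nan])

# NaN never compares equal to anything, including itself.
nan_equals_itself = (np.nan == np.nan)

# Element-wise self-comparison is True only where the value is not NaN.
self_equal = (a == a)

idx_compare = np.nonzero(self_equal)[0]
idx_isnan = np.flatnonzero(~np.isnan(a))

print(nan_equals_itself, self_equal, idx_compare, idx_isnan)
```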

It is now easy to replace the NaNs with the desired values:

    # Leading NaNs take the first valid value.
    for i in range(0, idx[0]):
        a[i] = a[idx[0]]
    # Trailing NaNs take the last valid value.
    for i in range(idx[-1] + 1, a.size):
        a[i] = a[idx[-1]]

Finally, we can put this in a function:

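    import numpy as np

    def FixNaNs(arr):
        # Only 1D arrays are handled.
        if len(arr.shape) > 1:
            raise Exception("Only 1D arrays are supported.")
        # NaN != NaN, so arr == arr is True only at non-NaN positions.
        idxs = np.nonzero(arr == arr)[0]
        if len(idxs) == 0:
            return None  # nothing but NaNs: no value to copy
        ret = arr  # note: an alias, so arr itself is modified in place
        # Fill leading NaNs with the first valid value.
        for i in range(0, idxs[0]):
            ret[i] = ret[idxs[0]]
        # Fill trailing NaNs with the last valid value.
        for i in range(idxs[-1] + 1, ret.size):
            ret[i] = ret[idxs[-1]]
        return ret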
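Be aware that the `ret=arr` assignment in this function does not copy anything, so the caller's array is modified in place. The sketch below, with made-up data, shows the difference between such an alias and an explicit `arr.copy()`, which would be the usual choice if the input should stay untouched.

```python
import numpy as np

arr = np.array([np.nan, 1.0, 2.0])
alias = arr            # same object, no data is copied
snapshot = arr.copy()  # independent buffer

alias[0] = 1.0         # writes through to arr as well

print(arr)       # the NaN in arr is gone
print(snapshot)  # the copy still starts with NaN
```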

edit

Ouch, coming from C++ I always forget about list ranges... @aix's solution is way more elegant and efficient than my C++-ish loops; use that instead of mine.