Interpolate NaN values in a numpy array


Let's first define a simple helper function to make it more straightforward to handle indices and logical indices of NaNs:

import numpy as np

def nan_helper(y):
    """Helper to handle indices and logical indices of NaNs.

    Input:
        - y, 1d numpy array with possible NaNs
    Output:
        - nans, logical indices of NaNs
        - index, a function, with signature indices= index(logical_indices),
          to convert logical indices of NaNs to 'equivalent' indices
    Example:
        >>> # linear interpolation of NaNs
        >>> nans, x= nan_helper(y)
        >>> y[nans]= np.interp(x(nans), x(~nans), y[~nans])
    """

    return np.isnan(y), lambda z: z.nonzero()[0]

The nan_helper(.) function can now be used like this:

>>> y= np.array([1, 1, 1, np.nan, np.nan, 2, 2, np.nan, 0])
>>>
>>> nans, x= nan_helper(y)
>>> y[nans]= np.interp(x(nans), x(~nans), y[~nans])
>>>
>>> print(y.round(2))
[ 1.    1.    1.    1.33  1.67  2.    2.    1.    0.  ]
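
One detail worth knowing when applying this idiom: np.interp does not extrapolate. By default, query points outside the range of the valid samples receive the first or last valid value, so leading and trailing NaNs are filled with a constant rather than a trend. A minimal sketch, assuming the nan_helper defined above (repeated here only so the block runs on its own); the sample array is made up for illustration:

```python
import numpy as np

def nan_helper(y):
    # Same helper as above: NaN mask plus a mask-to-indices converter.
    return np.isnan(y), lambda z: z.nonzero()[0]

# Illustrative data with NaNs at both ends and in the middle.
y = np.array([np.nan, 1.0, np.nan, 3.0, np.nan, np.nan])

nans, x = nan_helper(y)
y[nans] = np.interp(x(nans), x(~nans), y[~nans])

# The interior gap is interpolated linearly (1 -> 2 -> 3);
# the edge NaNs take the nearest valid value (1 on the left, 3 on the right).
print(y)
```

If constant filling at the edges is not what you want, np.interp accepts left and right arguments that set the value returned outside the valid range.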

---
Although it may at first seem like overkill to define a separate function just to do things like this:

>>> nans, x= np.isnan(y), lambda z: z.nonzero()[0]

it will eventually pay dividends.

So, whenever you are working with NaN-related data, just encapsulate all the NaN-related functionality you need in one or more dedicated helper functions. Your code base will be more coherent and readable, because it follows easily understandable idioms.

Interpolation is a good context in which to see how NaN handling is done, but similar techniques apply in many other contexts as well.
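
As one illustration of another context, the same mask-plus-indices pair can drive a forward fill, where each NaN is replaced by the last valid value before it. This is only a sketch: forward_fill is a hypothetical name, not part of NumPy, and the choice to leave leading NaNs untouched is an assumption.

```python
import numpy as np

def nan_helper(y):
    # Same helper as above: NaN mask plus a mask-to-indices converter.
    return np.isnan(y), lambda z: z.nonzero()[0]

def forward_fill(y):
    """Hypothetical helper: replace each NaN with the last valid value before it."""
    y = np.asarray(y, dtype=float).copy()
    nans, x = nan_helper(y)
    nan_idx, valid_idx = x(nans), x(~nans)
    if valid_idx.size == 0:
        return y  # nothing to fill from
    # For each NaN, locate the position (within valid_idx) of the last valid
    # sample to its left; -1 means there is none.
    pos = np.searchsorted(valid_idx, nan_idx, side="right") - 1
    has_prev = pos >= 0
    y[nan_idx[has_prev]] = y[valid_idx[pos[has_prev]]]
    return y

filled = forward_fill([np.nan, 1.0, np.nan, np.nan, 2.0, np.nan])
print(filled)  # leading NaN stays; the rest become 1, 1, 1, 2, 2
```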


I came up with this code:

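import numpy as np
nan = np.nan

A = np.array([1, nan, nan, 2, 2, nan, 0])

ok = ~np.isnan(A)                      # True where A holds a valid value
xp = ok.ravel().nonzero()[0]           # indices of the valid samples
fp = A[ok]                             # values of the valid samples
x  = np.isnan(A).ravel().nonzero()[0]  # indices of the NaNs

A[np.isnan(A)] = np.interp(x, xp, fp)

print(A)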

It prints

 [ 1.          1.33333333  1.66666667  2.          2.          1.          0.        ]


Just use NumPy's logical indexing and np.where to apply a 1D interpolation.

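import numpy as np
from scipy import interpolate

def fill_nan(A):
    '''
    interpolate to fill nan values
    '''
    inds = np.arange(A.shape[0])
    # indices of the finite (non-NaN, non-inf) entries
    good = np.where(np.isfinite(A))
    # bounds_error=False: points outside the valid range yield NaN instead of raising
    f = interpolate.interp1d(inds[good], A[good], bounds_error=False)
    # keep the original values where finite, use interpolated values elsewhere
    B = np.where(np.isfinite(A), A, f(inds))
    return B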
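
Because interp1d is called with bounds_error=False and its default fill value is NaN, fill_nan leaves leading and trailing NaNs in place and fills only the interior gaps. If SciPy is not available, a comparable result can be sketched with NumPy alone. The name fill_nan_np is hypothetical, and the block assumes a 1D float array with at least one finite entry:

```python
import numpy as np

def fill_nan_np(A):
    """Hypothetical NumPy-only variant: interpolate interior NaNs, keep edge NaNs."""
    A = np.asarray(A, dtype=float)
    inds = np.arange(A.shape[0])
    good = np.isfinite(A)
    # left/right=np.nan stops np.interp from clamping to the first/last
    # valid value outside the range of valid samples.
    filled = np.interp(inds, inds[good], A[good], left=np.nan, right=np.nan)
    return np.where(good, A, filled)

B = fill_nan_np(np.array([np.nan, 1.0, np.nan, 3.0, np.nan]))
print(B)  # interior gap becomes 2.0; the NaNs at both ends remain
```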