Most efficient way to forward-fill NaN values in numpy array


Here's one approach -

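import numpy as np

mask = np.isnan(arr)
# Column index where the value is valid, 0 where it is NaN
idx = np.where(~mask, np.arange(mask.shape[1]), 0)
# Running maximum gives the column of the last valid value so far
np.maximum.accumulate(idx, axis=1, out=idx)
out = arr[np.arange(idx.shape[0])[:, None], idx]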

If you don't want to create another array and just fill the NaNs in arr itself, replace the last step with this -

arr[mask] = arr[np.nonzero(mask)[0], idx[mask]]
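As a minimal sketch of the in-place variant, the snippet below wraps the steps in a helper. The name `ffill_inplace` is made up for illustration and is not part of the answer, and `arr` is assumed to be a 2D float array. NaNs at the start of a row are left as they are, since there is no earlier value to copy.

```python
import numpy as np

def ffill_inplace(arr):
    """Forward-fill NaNs along axis 1 of a 2D float array, modifying arr."""
    mask = np.isnan(arr)
    # Column index of each valid entry, 0 where the entry is NaN
    idx = np.where(~mask, np.arange(mask.shape[1]), 0)
    # Running maximum = column of the most recent valid entry
    np.maximum.accumulate(idx, axis=1, out=idx)
    # Overwrite only the NaN positions
    arr[mask] = arr[np.nonzero(mask)[0], idx[mask]]
    return arr

a = np.array([[5., np.nan, np.nan, 7.],
              [3., np.nan, 1., np.nan]])
ffill_inplace(a)
print(a)
```

Because only the masked positions are written, no second full-size float array is allocated for the result.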

Sample input, output -

In [179]: arr
Out[179]:
array([[  5.,  nan,  nan,   7.,   2.,   6.,   5.],
       [  3.,  nan,   1.,   8.,  nan,   5.,  nan],
       [  4.,   9.,   6.,  nan,  nan,  nan,   7.]])

In [180]: out
Out[180]:
array([[ 5.,  5.,  5.,  7.,  2.,  6.,  5.],
       [ 3.,  3.,  1.,  8.,  8.,  5.,  5.],
       [ 4.,  9.,  6.,  6.,  6.,  6.,  7.]])
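To see why this works, it can help to print the intermediate index array for the sample input. This is only an illustrative trace; the variable name `filled_idx` is introduced here and does not appear in the answer.

```python
import numpy as np

arr = np.array([[5., np.nan, np.nan, 7., 2., 6., 5.],
                [3., np.nan, 1., 8., np.nan, 5., np.nan],
                [4., 9., 6., np.nan, np.nan, np.nan, 7.]])

mask = np.isnan(arr)

# Valid entries keep their own column index, NaN entries get 0
idx = np.where(~mask, np.arange(mask.shape[1]), 0)
print(idx)
# [[0 0 0 3 4 5 6]
#  [0 0 2 3 0 5 0]
#  [0 1 2 0 0 0 6]]

# The running maximum replaces each 0 with the last valid column seen so far
filled_idx = np.maximum.accumulate(idx, axis=1)
print(filled_idx)
# [[0 0 0 3 4 5 6]
#  [0 0 2 3 3 5 5]
#  [0 1 2 2 2 2 6]]

# Gathering with these column indices yields the forward-filled array
out = arr[np.arange(arr.shape[0])[:, None], filled_idx]
```

Each NaN position ends up pointing at the column of the nearest valid value to its left, which is what forward-filling needs.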


Use Numba. This should give a significant speedup:

import numba

@numba.jit
def loops_fill(arr):
    ...
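The body of `loops_fill` is not shown above. Purely as an assumption about what a loop-based version might look like, here is a sketch under a different name, `loops_fill_sketch`, so it is not mistaken for the original code. It falls back to plain Python if Numba is not installed, and no speedup is measured here.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback: run as plain (slow) Python when Numba is not installed
        return func

@njit
def loops_fill_sketch(arr):
    """Forward-fill NaNs along each row with explicit loops (hypothetical body)."""
    out = arr.copy()
    for i in range(out.shape[0]):
        # Start at column 1: a NaN in column 0 has nothing to copy from
        for j in range(1, out.shape[1]):
            if np.isnan(out[i, j]):
                out[i, j] = out[i, j - 1]
    return out

sample = np.array([[5., np.nan, np.nan, 7., 2.],
                   [3., np.nan, 1., 8., np.nan]])
filled = loops_fill_sketch(sample)
print(filled)
```

The sketch copies the input first, so `sample` itself is left unchanged.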


For those who came here looking for a backward-fill of NaN values, I modified the solution provided by Divakar above to do exactly that. The trick is to run the accumulation on the reversed array and use the minimum instead of the maximum, with NaN positions given the last column index as their placeholder.

Here is the code:

import numpy as np

# As provided in the answer by Divakar
def ffill(arr):
    mask = np.isnan(arr)
    idx = np.where(~mask, np.arange(mask.shape[1]), 0)
    np.maximum.accumulate(idx, axis=1, out=idx)
    out = arr[np.arange(idx.shape[0])[:, None], idx]
    return out

# My modification to do a backward-fill
def bfill(arr):
    mask = np.isnan(arr)
    idx = np.where(~mask, np.arange(mask.shape[1]), mask.shape[1] - 1)
    idx = np.minimum.accumulate(idx[:, ::-1], axis=1)[:, ::-1]
    out = arr[np.arange(idx.shape[0])[:, None], idx]
    return out

# Test both functions
arr = np.array([[5, np.nan, np.nan, 7, 2],
                [3, np.nan, 1, 8, np.nan],
                [4, 9, 6, np.nan, np.nan]])
print('Array:')
print(arr)
print('\nffill')
print(ffill(arr))
print('\nbfill')
print(bfill(arr))

Output:

Array:
[[ 5. nan nan  7.  2.]
 [ 3. nan  1.  8. nan]
 [ 4.  9.  6. nan nan]]

ffill
[[5. 5. 5. 7. 2.]
 [3. 3. 1. 8. 8.]
 [4. 9. 6. 6. 6.]]

bfill
[[ 5.  7.  7.  7.  2.]
 [ 3.  1.  1.  8. nan]
 [ 4.  9.  6. nan nan]]

Edit: Update according to comment of MS_