conditional fill in pandas dataframe


This can be done fairly efficiently with Numba. If you are not able to use Numba, just omit @njit and your logic will run as a Python-level loop.

    import numpy as np
    import pandas as pd
    from numba import njit

    np.random.seed(0)
    df = pd.DataFrame(1000*(2+np.random.randn(500, 1)), columns=['A'])
    df.loc[1, 'A'] = np.nan
    df.loc[15, 'A'] = np.nan
    df.loc[240, 'A'] = np.nan

    @njit
    def recurse_nb(x):
        out = x.copy()
        for i in range(1, x.shape[0]):
            if not np.isnan(x[i]) and (abs(1 - x[i] / out[i-1]) < 0.3):
                out[i] = out[i-1]
        return out

    df['B'] = recurse_nb(df['A'].values)
    print(df.head(10))

                 A            B
    0  3764.052346  3764.052346
    1          NaN          NaN
    2  2978.737984  2978.737984
    3  4240.893199  4240.893199
    4  3867.557990  4240.893199
    5  1022.722120  1022.722120
    6  2950.088418  2950.088418
    7  1848.642792  1848.642792
    8  1896.781148  1848.642792
    9  2410.598502  2410.598502
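If Numba may or may not be installed, one way to keep a single code path is a fallback decorator. This is only a sketch: the identity `njit` fallback and the `sample` array are assumptions made for illustration, not part of the answer above.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback for the bare @njit form only: return the function unchanged,
        # so the loop runs as ordinary Python.
        return func


@njit
def recurse_nb(x):
    out = x.copy()
    for i in range(1, x.shape[0]):
        # A NaN in x[i] is left alone; a NaN in out[i-1] makes the comparison
        # False, so x[i] is kept as well.
        if not np.isnan(x[i]) and abs(1 - x[i] / out[i - 1]) < 0.3:
            out[i] = out[i - 1]
    return out


# Made-up sample data for illustration.
sample = np.array([100.0, 110.0, 125.0, np.nan, 200.0, 210.0, 400.0])
result = recurse_nb(sample)
print(result.tolist())  # [100.0, 100.0, 100.0, nan, 200.0, 200.0, 400.0]
```

Either way the logic is identical; only the speed differs.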


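I'm not sure how you want to handle the first row, which has no previous value, or division by NaN. Here the first previous value is seeded with 1, and a NaN previous value leaves A unchanged: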

    df = pd.DataFrame([1,2,3,4,5,None,6,7,8,9,10], columns=['A'])
    b1 = df.A.shift(1)
    b1[0] = 1
    b = list(map(lambda a,b1: a if np.isnan(a) else (b1 if abs(b1-a)/b1 < 0.3 else a), df.A, b1 ))
    df['B'] = b
    df

           A    B
    0    1.0  1.0
    1    2.0  2.0
    2    3.0  3.0
    3    4.0  4.0
    4    5.0  4.0
    5    NaN  NaN
    6    6.0  6.0
    7    7.0  6.0
    8    8.0  7.0
    9    9.0  8.0
    10  10.0  9.0

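As per @jpp, you could also build list b with a list comprehension:

    # Comparisons with NaN are False, so a NaN in either a or prev falls through to a,
    # matching the map version above.
    b = [prev if abs(prev - a) / prev < 0.3 else a for a, prev in zip(df.A, b1)]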
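One caveat: the shift-based version compares each A with the previous A, whereas the loop-based answers compare it with the previous B. The two can disagree once a value has been carried forward. A small sketch with made-up data (the names `prev_a`, `shift_b` and `rec_b` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'A': [100.0, 120.0, 140.0, 160.0]})  # made-up data
a_vals = df['A'].tolist()

# Shift-based: compare each A with the previous A (first row compared with itself).
prev_a = df['A'].shift(1)
prev_a.iloc[0] = a_vals[0]
shift_b = [p if abs(p - a) / p < 0.3 else a
           for a, p in zip(a_vals, prev_a.tolist())]

# Recursive: compare each A with the previous B.
rec_b = [a_vals[0]]
for a in a_vals[1:]:
    p = rec_b[-1]
    rec_b.append(p if abs(1 - a / p) < 0.3 else a)

print(shift_b)  # [100.0, 100.0, 120.0, 140.0]
print(rec_b)    # [100.0, 100.0, 140.0, 140.0]
```

At index 2, 140 is within 30% of the previous A (120) but not of the previous B (100), so the results diverge. Which one is right depends on which rule you actually need.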


A simple solution that I could come up with is the following. I was wondering if there is a more Pythonic way of doing this:

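    a = df['A'].values
    b = []
    b.append(a[0])  # the first row has no previous B, so it is kept as-is
    for i in range(1, len(a)):
        if np.isnan(a[i]):
            b.append(a[i])
        else:
            # carry the previous B forward when A is within 30% of it
            b.append(b[i-1] if abs(1 - a[i]/b[i-1]) < 0.3 else a[i])
    df['B'] = b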
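One possibly more compact way to write the same recurrence is `itertools.accumulate`, which passes the previous result and the next value to a function. This is a sketch rather than a recommendation: the helper name `step` is made up, it assumes no B value is ever 0 (plain Python floats raise `ZeroDivisionError`), and it is still a Python-level loop under the hood.

```python
from itertools import accumulate

import numpy as np
import pandas as pd


def step(prev_b, a):
    # Keep the previous B when a is within 30% of it. Any comparison involving
    # NaN is False, so a NaN in either argument returns a unchanged.
    return prev_b if abs(1 - a / prev_b) < 0.3 else a


df = pd.DataFrame({'A': [1.0, 2.0, 3.0, 4.0, 5.0, np.nan, 6.0, 7.0, 8.0, 9.0, 10.0]})

# accumulate yields A[0] first, then step(previous B, next A) for each later row.
df['B'] = list(accumulate(df['A'].tolist(), step))
print(df)
```

For this frame, B comes out as 1, 2, 3, 4, 4, NaN, 6, 6, 8, 8, 8.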