How to replace NaNs by preceding or next values in pandas DataFrame?


You could use the fillna method on the DataFrame and specify the method as ffill (forward fill):

>>> df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
>>> df.fillna(method='ffill')
   0  1  2
0  1  2  3
1  4  2  3
2  4  2  9

This method "propagate[s] last valid observation forward to next valid".

To go the opposite way, there's also a bfill method.
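
As a rough sketch of the backward direction, here is the same frame filled with the dedicated `DataFrame.bfill` method (equivalent in effect to `fillna(method='bfill')`; using it here instead of the `method=` form is my choice, not part of the answer above). Note that NaNs with no valid value below them stay NaN.

```python
import pandas as pd

# Same frame as in the answer above; None is stored as NaN.
df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])

# Backward fill: each NaN takes the next valid value below it in its column.
filled = df.bfill()
print(filled)

# Column 2 is fully filled (3, 9, 9). The trailing NaNs in columns 0 and 1
# remain, because there is no later value to pull back.
```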

This method doesn't modify the DataFrame in place - you'll need to rebind the returned DataFrame to a variable or else specify inplace=True:

df.fillna(method='ffill', inplace=True)
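
To make the difference concrete, here is a small sketch comparing the two options. It assumes the dedicated `DataFrame.ffill` method, which accepts the same `inplace` flag; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])

# Without inplace=True the call returns a new DataFrame; df itself is untouched.
result = df.ffill()
nans_in_original = int(df.isna().sum().sum())
nans_in_result = int(result.isna().sum().sum())
print(nans_in_original, nans_in_result)

# Option 1: rebind the name to the returned frame.
rebound = df.ffill()

# Option 2: modify a frame in place (done on a copy here so df stays comparable).
in_place = df.copy()
in_place.ffill(inplace=True)
```

Both options end up with the same filled frame; rebinding tends to be easier to follow when calls are chained.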


The accepted answer is perfect. I had a related but slightly different situation where I had to fill in forward but only within groups. In case someone has the same need, know that fillna works on a DataFrameGroupBy object.

>>> import numpy as np
>>> import pandas as pd
>>> example = pd.DataFrame({'name': list('aaabbbcccc'), 'number': [0, 1, 2, np.nan, 4, np.nan, 6, 7, 8, 9]})
>>> example
  name  number
0    a     0.0
1    a     1.0
2    a     2.0
3    b     NaN
4    b     4.0
5    b     NaN
6    c     6.0
7    c     7.0
8    c     8.0
9    c     9.0
>>> example.groupby('name')['number'].fillna(method='ffill')  # fill in row 5 but not row 3
0    0.0
1    1.0
2    2.0
3    NaN
4    4.0
5    4.0
6    6.0
7    7.0
8    8.0
9    9.0
Name: number, dtype: float64
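
The same per-group fill can also be sketched with the groupby object's own `ffill` method, which I believe newer pandas versions prefer over passing `method=` to `fillna`. The `number_filled` column name is just a placeholder for this example.

```python
import numpy as np
import pandas as pd

example = pd.DataFrame({
    'name': list('aaabbbcccc'),
    'number': [0, 1, 2, np.nan, 4, np.nan, 6, 7, 8, 9],
})

# Forward fill within each 'name' group; values never cross a group boundary.
filled = example.groupby('name')['number'].ffill()

# The result keeps the original index, so it can be assigned straight back.
example['number_filled'] = filled
print(example)

# Row 3 is the first row of group 'b', so it has nothing to inherit and stays NaN.
# Row 5 picks up 4.0 from row 4, which is in the same group.
```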


You can use pandas.DataFrame.fillna with the method='ffill' option. 'ffill' stands for 'forward fill' and will propagate last valid observation forward. The alternative is 'bfill' which works the same way, but backwards.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
df = df.fillna(method='ffill')
print(df)
#   0  1  2
#0  1  2  3
#1  4  2  3
#2  4  2  9

There is also a direct synonym function for this, pandas.DataFrame.ffill, to make things simpler.
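
Since the question asks about preceding or next values, here is a minimal sketch that chains the two dedicated methods. As far as I know, the `method=` argument of `fillna` is deprecated from pandas 2.1 onward in favor of `ffill` and `bfill`, so this form should also be the safer one on current versions. The frame below is a hypothetical variation with a leading gap, not the one from the answers.

```python
import pandas as pd

# Hypothetical frame: column 0 starts with a NaN, so forward fill alone cannot fix it.
df = pd.DataFrame([[None, 2, 3], [4, None, None], [None, None, 9]])

# Forward fill first: every NaN takes the last valid value above it.
forward = df.ffill()

# Then backward fill whatever is still missing at the top of a column.
both = df.ffill().bfill()

print(forward)
print(both)
```

The order matters: `ffill().bfill()` prefers preceding values and only falls back to following ones, while `bfill().ffill()` does the reverse.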