How to replace NaNs with preceding or next values in a pandas DataFrame?
You could use the fillna method on the DataFrame and specify the method as ffill (forward fill):
>>> df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
>>> df.fillna(method='ffill')
   0  1  2
0  1  2  3
1  4  2  3
2  4  2  9
This method "propagate[s] last valid observation forward to next valid".
To go the opposite way, there's also a bfill method.
This method doesn't modify the DataFrame in place. You'll need to rebind the returned DataFrame to a variable or else specify inplace=True:
df.fillna(method='ffill', inplace=True)
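As a rough illustration of the backward direction and of rebinding, here is a minimal sketch. It uses the dedicated bfill method rather than fillna(method='bfill'), on the assumption that you may be on a recent pandas release where the method argument of fillna is deprecated. The variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, np.nan, np.nan], [np.nan, np.nan, 9]])

# bfill() returns a new DataFrame; df itself is left unchanged,
# so the result has to be bound to a name to be kept.
filled_back = df.bfill()

# Count missing values before and after the backward fill.
remaining_before = int(df.isna().sum().sum())
remaining_after = int(filled_back.isna().sum().sum())
print(remaining_before, remaining_after)
```

Note that a backward fill cannot fill NaNs that have no valid value below them in the same column, so some may remain.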
The accepted answer is perfect. I had a related but slightly different situation where I had to fill in forward but only within groups. In case someone has the same need, know that fillna works on a DataFrameGroupBy object.
>>> import numpy as np
>>> import pandas as pd
>>> example = pd.DataFrame({'name': list('aaabbbcccc'), 'number': [0, 1, 2, np.nan, 4, np.nan, 6, 7, 8, 9]})
>>> example
  name  number
0    a     0.0
1    a     1.0
2    a     2.0
3    b     NaN
4    b     4.0
5    b     NaN
6    c     6.0
7    c     7.0
8    c     8.0
9    c     9.0
>>> example.groupby('name')['number'].fillna(method='ffill')  # fill in row 5 but not row 3
0    0.0
1    1.0
2    2.0
3    NaN
4    4.0
5    4.0
6    6.0
7    7.0
8    8.0
9    9.0
Name: number, dtype: float64
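The same per-group fill can probably also be written with the ffill method on the grouped column, which avoids the method argument. This is a sketch under that assumption, and the 'filled' column name is made up for the example.

```python
import numpy as np
import pandas as pd

example = pd.DataFrame({
    'name': list('aaabbbcccc'),
    'number': [0, 1, 2, np.nan, 4, np.nan, 6, 7, 8, 9],
})

# Forward fill within each 'name' group only, so a value from
# group 'a' is never carried into group 'b'.
example['filled'] = example.groupby('name')['number'].ffill()
print(example)
```

Row 3 stays NaN because it is the first row of group b and has no earlier value in that group, while row 5 picks up 4.0 from row 4.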
You can use pandas.DataFrame.fillna with the method='ffill' option. 'ffill' stands for 'forward fill' and will propagate the last valid observation forward. The alternative is 'bfill', which works the same way, but backwards.
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, None, None], [None, None, 9]])
df = df.fillna(method='ffill')
print(df)
#   0  1  2
#0  1  2  3
#1  4  2  3
#2  4  2  9
There is also a direct synonym function for this, pandas.DataFrame.ffill, to make things simpler.
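For what it's worth, the direct methods also chain, which helps when you want preceding values where they exist and next values otherwise. A small sketch, using a made-up one-column frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': [np.nan, 1.0, np.nan, np.nan, 4.0, np.nan]})

# Forward fill alone leaves the leading NaN, since nothing precedes it.
forward = df['x'].ffill()

# Forward fill first, then backward fill whatever is still missing.
both = df['x'].ffill().bfill()

print(forward.tolist())
print(both.tolist())
```

The order of the chain matters: ffill().bfill() prefers preceding values, while bfill().ffill() would prefer next values.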