Dropping infinite values from dataframes in pandas?
The simplest way would be to first replace() infs with NaN:
df.replace([np.inf, -np.inf], np.nan, inplace=True)
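and then use dropna(). The two calls can be chained, as long as inplace=True is left out (with it, replace() returns None and there is nothing to chain on):
df.replace([np.inf, -np.inf], np.nan).dropna(subset=["col1", "col2"], how="all")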
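To make the effect of subset and how="all" concrete, here is a minimal sketch of the replace-then-drop pattern. The sample data and the extra column "col3" are made up for illustration; only "col1" and "col2" come from the snippet above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "col3" is an assumption added for illustration.
df = pd.DataFrame({
    "col1": [1.0, np.inf, 3.0, -np.inf],
    "col2": [4.0, np.inf, np.nan, 6.0],
    "col3": [10.0, 20.0, 30.0, 40.0],
})

# Without inplace=True, replace() returns a new DataFrame, so dropna() can be chained.
as_nan = df.replace([np.inf, -np.inf], np.nan)

# how="all": drop a row only if col1 AND col2 are both missing.
dropped_all = as_nan.dropna(subset=["col1", "col2"], how="all")

# how="any": drop a row if col1 OR col2 is missing.
dropped_any = as_nan.dropna(subset=["col1", "col2"], how="any")

print(dropped_all)
print(dropped_any)
```

With how="all" only the row where both columns were infinite is removed; how="any" is the stricter choice if a single inf in either column should disqualify the row.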
For example:
In [11]: df = pd.DataFrame([1, 2, np.inf, -np.inf])

In [12]: df.replace([np.inf, -np.inf], np.nan)
Out[12]:
    0
0   1
1   2
2 NaN
3 NaN
The same method would work for a Series.
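As a rough illustration of the Series case (the values here are invented for the example):

```python
import numpy as np
import pandas as pd

# Hypothetical Series containing both +inf and -inf.
s = pd.Series([1.0, np.inf, 2.0, -np.inf])

# Same two steps as for a DataFrame: inf -> NaN, then drop the NaN entries.
finite_only = s.replace([np.inf, -np.inf], np.nan).dropna()

print(finite_only)
```

The original index labels are kept, so the result still shows which positions survived.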
With option context, this is possible without permanently setting use_inf_as_na. For example:
with pd.option_context('mode.use_inf_as_na', True):
    df = df.dropna(subset=['col1', 'col2'], how='all')
Of course, it can be set to treat inf as NaN permanently with:
pd.set_option('use_inf_as_na', True)
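For older versions, replace use_inf_as_na with use_inf_as_null. Note that the option itself is deprecated as of pandas 2.1, so on recent versions it is safer to convert inf to NaN explicitly with replace().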
Here is another method, using .loc to replace inf with NaN on a Series:
s.loc[(~np.isfinite(s)) & s.notnull()] = np.nan
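The mask in that line has two parts because np.isfinite() is False for NaN as well as for inf; the notnull() term narrows it down to the infinite entries only. A small sketch on made-up data (the name is_inf is just for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical Series with a finite value, +inf, an existing NaN, and -inf.
s = pd.Series([1.0, np.inf, np.nan, -np.inf])

# ~np.isfinite(s) is True for inf, -inf AND NaN;
# s.notnull() removes the NaN positions, leaving only the infinities.
is_inf = (~np.isfinite(s)) & s.notnull()
print(is_inf.tolist())

# Overwrite just the infinite entries in place.
s.loc[is_inf] = np.nan
print(s)
```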
So, in response to the original question:
df = pd.DataFrame(np.ones((3, 3)), columns=list('ABC'))
for i in range(3):
    df.iat[i, i] = np.inf

df
          A         B         C
0       inf  1.000000  1.000000
1  1.000000       inf  1.000000
2  1.000000  1.000000       inf

df.sum()
A    inf
B    inf
C    inf
dtype: float64

df.apply(lambda s: s[np.isfinite(s)].dropna()).sum()
A    2
B    2
C    2
dtype: float64
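A possible variation on the same idea, not taken from the answer above: np.isfinite() can be applied to the whole frame and used as a boolean mask, either to blank out the infinite cells or to drop rows. The names finite_sums and rows_kept are illustrative only.

```python
import numpy as np
import pandas as pd

# Same 3x3 frame as above, with inf on the diagonal.
df = pd.DataFrame(np.ones((3, 3)), columns=list("ABC"))
for i in range(3):
    df.iat[i, i] = np.inf

# where() keeps cells where the mask is True and sets the rest to NaN;
# sum() skips NaN by default, so only finite values are added up.
finite_sums = df.where(np.isfinite(df)).sum()
print(finite_sums)

# To drop rows instead, filter on the column(s) of interest.
# Here: keep only rows whose "A" value is finite.
rows_kept = df[np.isfinite(df["A"])]
print(rows_kept)
```

Masking the whole frame avoids the per-column apply(), while filtering on a single column mirrors what dropna(subset=...) does for NaN.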