Comparing pandas Series for equality when they contain nan?


How about this. First check the NaNs are in the same place (using isnull):

    In [11]: s1.isnull()
    Out[11]:
    0    False
    1     True
    dtype: bool

    In [12]: s1.isnull() == s2.isnull()
    Out[12]:
    0    True
    1    True
    dtype: bool

Then check the values which aren't NaN are equal (using notnull):

    In [13]: s1[s1.notnull()]
    Out[13]:
    0    1
    dtype: float64

    In [14]: s1[s1.notnull()] == s2[s2.notnull()]
    Out[14]:
    0    True
    dtype: bool

In order to be equal we need both to be True:

    In [15]: (s1.isnull() == s2.isnull()).all() and (s1[s1.notnull()] == s2[s2.notnull()]).all()
    Out[15]: True
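The two checks above can be wrapped into one function. This is only a minimal sketch: the name series_equal_with_nan is hypothetical, and the extra index check is an assumption added here, because recent pandas versions raise a ValueError when comparing Series whose labels differ.

```python
import numpy as np
import pandas as pd


def series_equal_with_nan(a: pd.Series, b: pd.Series) -> bool:
    """Hypothetical helper: True if NaNs line up and all other values match."""
    # Assumption: both Series must share the same index, otherwise `==`
    # between them raises in recent pandas versions.
    if not a.index.equals(b.index):
        return False
    # Step 1: the NaNs must be in the same places.
    if not (a.isnull() == b.isnull()).all():
        return False
    # Step 2: the values which aren't NaN must be equal.
    return bool((a[a.notnull()] == b[b.notnull()]).all())


s1 = pd.Series([1, np.nan])
s2 = pd.Series([1, np.nan])
s3 = pd.Series([np.nan, 1])  # same values, NaN in a different place

print(series_equal_with_nan(s1, s2))  # True
print(series_equal_with_nan(s1, s3))  # False
```

Returning early after the first check matters: if the NaNs are in different places, the non-NaN subsets have different labels and should not be compared directly.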

You could also check name etc. if this wasn't sufficient.

If you want to raise if they are different, use assert_series_equal from pandas.testing (in older pandas versions it lived in pandas.util.testing, which has since been deprecated and removed):

    In [21]: from pandas.testing import assert_series_equal

    In [22]: assert_series_equal(s1, s2)
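For illustration, here is a small sketch of what the assertion does on a match and on a mismatch. It assumes a pandas version that provides pandas.testing; the Series s3 and named are made up for the example.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_series_equal

s1 = pd.Series([1, np.nan])
s2 = pd.Series([1, np.nan])
s3 = pd.Series([2, np.nan])  # differs in the first value

# Passes silently: NaNs in the same position are treated as equal.
assert_series_equal(s1, s2)

# Raises AssertionError when the values differ.
try:
    assert_series_equal(s1, s3)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
print(mismatch_detected)

# The name is compared by default; check_names=False skips that check.
named = s2.rename("x")
assert_series_equal(s1, named, check_names=False)
```

This is mostly useful in tests, where an exception with a description of the difference is more helpful than a bare False.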


Currently one should just use series1.equals(series2) (see the pandas docs for Series.equals). It treats NaNs in the same positions as equal, so it also checks that the NaNs line up.
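A short sketch of the difference between equals and a plain elementwise comparison; the Series here are made up for the example.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([1, np.nan])
s2 = pd.Series([1, np.nan])
s3 = pd.Series([np.nan, 1])  # same values, NaN in a different position

# Elementwise == says the Series differ, because NaN != NaN.
elementwise = bool((s1 == s2).all())
print(elementwise)  # False

# equals treats NaNs in the same position as equal.
print(s1.equals(s2))  # True

# NaNs in different positions are not equal.
print(s1.equals(s3))  # False
```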


    In [16]: s1 = Series([1,np.nan])

    In [17]: s2 = Series([1,np.nan])

    In [18]: (s1.dropna()==s2.dropna()).all()
    Out[18]: True