
Pandas 'count(distinct)' equivalent


I believe this is what you want:

table.groupby('YEARMONTH').CLIENTCODE.nunique()

Example:

In [2]: table
Out[2]:
   CLIENTCODE  YEARMONTH
0           1     201301
1           1     201301
2           2     201301
3           1     201302
4           2     201302
5           2     201302
6           3     201302

In [3]: table.groupby('YEARMONTH').CLIENTCODE.nunique()
Out[3]:
YEARMONTH
201301       2
201302       3
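For anyone who wants to run the example, here is a minimal self-contained sketch. The way the frame is constructed is an assumption (the session above only shows its printed form), and the name distinct_clients is just illustrative.

```python
import pandas as pd

# Rebuild the example frame; the construction is assumed,
# since the original session only prints the table.
table = pd.DataFrame({
    "CLIENTCODE": [1, 1, 2, 1, 2, 2, 3],
    "YEARMONTH": [201301, 201301, 201301, 201302, 201302, 201302, 201302],
})

# Count distinct CLIENTCODE values within each YEARMONTH group,
# roughly what COUNT(DISTINCT CLIENTCODE) ... GROUP BY YEARMONTH does in SQL.
distinct_clients = table.groupby("YEARMONTH")["CLIENTCODE"].nunique()
print(distinct_clients)
```

The result is a Series indexed by YEARMONTH; calling .reset_index() on it should give an ordinary two-column frame if that is easier to work with.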


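Here is another, simpler method. Note that it counts the rows for each YEARMONTH value rather than the distinct CLIENTCODE values, so it matches a distinct count only when each client appears at most once per month. Say your dataframe is named daat and the column is YEARMONTH: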

daat.YEARMONTH.value_counts()
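To see how value_counts() relates to a distinct count, here is a small sketch on the same sample data as the first answer (reusing that data under the name daat is an assumption, as are the variable names). It suggests the two only agree once duplicate (YEARMONTH, CLIENTCODE) pairs are dropped first.

```python
import pandas as pd

# Same sample data as the first answer, under the name used here (assumed).
daat = pd.DataFrame({
    "CLIENTCODE": [1, 1, 2, 1, 2, 2, 3],
    "YEARMONTH": [201301, 201301, 201301, 201302, 201302, 201302, 201302],
})

# value_counts() counts rows per YEARMONTH value, duplicates included.
rows_per_month = daat.YEARMONTH.value_counts()

# nunique() on the grouped column counts distinct clients per month.
clients_per_month = daat.groupby("YEARMONTH").CLIENTCODE.nunique()

# Dropping repeated (YEARMONTH, CLIENTCODE) pairs first makes
# value_counts() line up with the distinct count.
deduped = daat.drop_duplicates(subset=["YEARMONTH", "CLIENTCODE"])
clients_via_value_counts = deduped.YEARMONTH.value_counts()

print(rows_per_month.sort_index())
print(clients_per_month)
print(clients_via_value_counts.sort_index())
```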


Interestingly enough, very often len(unique()) is a few times (3x-15x) faster than nunique().
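A rough sketch of the len(unique()) variant, for comparison. No timings are measured here, so the speed claim is not verified by this snippet; the sample values and variable names are hypothetical. One difference to keep in mind: nunique() skips NaN by default, while unique() keeps it, so the two can disagree on columns with missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a missing value to show where the two differ.
s = pd.Series([1.0, 1.0, 2.0, np.nan])

n_default = s.nunique()               # NaN is ignored by default
n_with_nan = s.nunique(dropna=False)  # NaN counted as its own value
n_len_unique = len(s.unique())        # unique() keeps NaN

# Per-group version on the sample table from the first answer (no NaN here,
# so it should agree with groupby(...).nunique()).
table = pd.DataFrame({
    "CLIENTCODE": [1, 1, 2, 1, 2, 2, 3],
    "YEARMONTH": [201301, 201301, 201301, 201302, 201302, 201302, 201302],
})
per_group = table.groupby("YEARMONTH")["CLIENTCODE"].apply(lambda x: len(x.unique()))

print(n_default, n_with_nan, n_len_unique)
print(per_group)
```

If you want to check the speed difference on your own data, timing both forms with the timeit module is probably the simplest approach.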