python equivalent of R table python equivalent of R table python python

python equivalent of R table


Pandas has a built-in function called value_counts().

Example: if your DataFrame has a column with values as 0's and 1's, and you want to count the total frequencies for each of them, then simply use this:

df.colName.value_counts()


A Counter object from the collections library will function like that.

from collections import Counterx = [[12, 6], [12, 0], [0, 6], [12, 0], [12, 0], [6, 0], [12, 6], [0, 6], [12, 0], [0, 6], [0, 6], [12, 0], [0, 6], [6, 0], [6, 0], [12, 0], [6, 0], [12, 0], [12, 0], [0, 6], [0, 6], [12, 6], [6, 0], [6, 0], [12, 6], [12, 0], [12, 0], [0, 6], [6, 0], [12, 6], [12, 6], [12, 6], [12, 0], [12, 0], [12, 0], [12, 0], [12, 6], [12, 0], [12, 0], [12, 6], [0, 6], [0, 6], [6, 0], [12, 6], [12, 6], [12, 6], [12, 6], [12, 6], [12, 0], [0, 6], [6, 0], [12, 0], [0, 6], [12, 6], [12, 6], [0, 6], [12, 0], [6, 0], [6, 0], [12, 6], [12, 0], [0, 6], [12, 0], [12, 0], [12, 0], [6, 0], [12, 6], [12, 6], [12, 6], [12, 6], [0, 6], [12, 0], [12, 6], [0, 6], [0, 6], [12, 0], [0, 6], [12, 6], [6, 0], [12, 6], [12, 6], [12, 0], [12, 0], [12, 6], [0, 6], [6, 0], [12, 0], [6, 0], [12, 0], [12, 0], [12, 6], [12, 0], [6, 0], [12, 6], [6, 0], [12, 0], [6, 0], [12, 0], [6, 0], [6, 0]]# Since the elements passed to a `Counter` must be hashable, we have to change the lists to tuples.x = [tuple(element) for element in x]freq = Counter(x)print freq[(12,6)]# Result:  28


Supposing you need to convert the data to a pandas DataFrame anyway, so that you have

L = [[12, 6], [12, 0], [0, 6], [12, 0], [12, 0], [6, 0], [12, 6], [0, 6], [12, 0], [0, 6], [0, 6], [12, 0], [0, 6], [6, 0], [6, 0], [12, 0], [6, 0], [12, 0], [12, 0], [0, 6], [0, 6], [12, 6], [6, 0], [6, 0], [12, 6], [12, 0], [12, 0], [0, 6], [6, 0], [12, 6], [12, 6], [12, 6], [12, 0], [12, 0], [12, 0], [12, 0], [12, 6], [12, 0], [12, 0], [12, 6], [0, 6], [0, 6], [6, 0], [12, 6], [12, 6], [12, 6], [12, 6], [12, 6], [12, 0], [0, 6], [6, 0], [12, 0], [0, 6], [12, 6], [12, 6], [0, 6], [12, 0], [6, 0], [6, 0], [12, 6], [12, 0], [0, 6], [12, 0], [12, 0], [12, 0], [6, 0], [12, 6], [12, 6], [12, 6], [12, 6], [0, 6], [12, 0], [12, 6], [0, 6], [0, 6], [12, 0], [0, 6], [12, 6], [6, 0], [12, 6], [12, 6], [12, 0], [12, 0], [12, 6], [0, 6], [6, 0], [12, 0], [6, 0], [12, 0], [12, 0], [12, 6], [12, 0], [6, 0], [12, 6], [6, 0], [12, 0], [6, 0], [12, 0], [6, 0], [6, 0]]df = pd.DataFrame(L, columns=('a', 'b'))

then you can do as suggested in this answer, using groupby.size():

tab = df.groupby(['a', 'b']).size()

tab looks as follows:

In [5]: tabOut[5]:a   b0   6    196   0    2012  0    33    6    28dtype: int64

and can easily be changed to a table form with unstack():

In [6]: tab.unstack()Out[6]:b      0     6a0    NaN  19.06   20.0   NaN12  33.0  28.0

Fill NaNs and convert to int at your own leisure!