
Counting frequency of values by date using pandas


It might be easiest to turn your Series into a DataFrame and use Pandas' groupby functionality (if you already have a DataFrame then skip straight to adding another column below).

If your Series is called s, then turn it into a DataFrame like so:

>>> df = pd.DataFrame({'Timestamp': s.index, 'Category': s.values})
>>> df
       Category           Timestamp
0      Facebook 2014-10-16 15:05:17
1         Vimeo 2014-10-16 14:56:37
2      Facebook 2014-10-16 14:25:16
...

Now add another column for the week and year (one way is to use apply and generate a string of the week/year numbers):

>>> df['Week/Year'] = df['Timestamp'].apply(lambda x: "%d/%d" % (x.week, x.year))
>>> df
             Timestamp     Category Week/Year
0  2014-10-16 15:05:17     Facebook   42/2014
1  2014-10-16 14:56:37        Vimeo   42/2014
2  2014-10-16 14:25:16     Facebook   42/2014
...

Finally, group by 'Week/Year' and 'Category' and aggregate with size() to get the counts. For the data in your question this produces the following:

>>> df.groupby(['Week/Year', 'Category']).size()
Week/Year  Category
41/2014    DailyMotion    1
           Facebook       3
           Vimeo          2
           Youtube        3
42/2014    Facebook       7
           Orkut          1
           Vimeo          1
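
Putting those steps together, here is a minimal end-to-end sketch; the sample data below is invented for illustration and only mirrors the shape of the data in the question:

import pandas as pd

# Made-up sample Series: categories indexed by timestamps
s = pd.Series(
    ['Facebook', 'Vimeo', 'Facebook', 'Youtube'],
    index=pd.to_datetime([
        '2014-10-16 15:05:17',
        '2014-10-16 14:56:37',
        '2014-10-16 14:25:16',
        '2014-10-08 09:12:00',
    ]),
)

# Series -> DataFrame, add a week/year key, then count per group
df = pd.DataFrame({'Timestamp': s.index, 'Category': s.values})
df['Week/Year'] = df['Timestamp'].apply(lambda x: "%d/%d" % (x.week, x.year))
print(df.groupby(['Week/Year', 'Category']).size())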


To be a little clearer, you do not need to create a new column called 'week_num' first.

df.groupby(by=lambda x: "%d/%d" % (x.week, x.year)).Category.value_counts()

The function passed to by is called on each Timestamp in the index to convert it to a week/year string, and the rows are then grouped by that string. (Note that week and year are Timestamp attributes, not methods, so they take no parentheses.)
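
As a quick check, here is a sketch on made-up data with the timestamps kept as the index, which is what the by callable receives:

import pandas as pd

# Made-up Series with a DatetimeIndex; the groupby key function
# receives each index label (a Timestamp) in turn
s = pd.Series(
    ['Facebook', 'Vimeo', 'Facebook'],
    index=pd.to_datetime(['2014-10-16 15:05:17',
                          '2014-10-16 14:56:37',
                          '2014-10-09 14:25:16']),
    name='Category',
)
df = s.to_frame()
print(df.groupby(by=lambda x: "%d/%d" % (x.week, x.year)).Category.value_counts())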


Convert your Timestamp column to a week number, then group by that week number and take value_counts of the categorical variable, like so:

df.groupby('week_num').Category.value_counts()

Here I have assumed that a new column week_num has already been created from the Timestamp column.
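
For completeness, one way to build that column (a sketch continuing from a df with a datetime Timestamp column, as in the first answer; dt.isocalendar() requires pandas >= 1.1):

# Vectorised ISO week/year; combining both avoids mixing identical
# week numbers from different years
iso = df['Timestamp'].dt.isocalendar()
df['week_num'] = iso['week'].astype(str) + '/' + iso['year'].astype(str)

df.groupby('week_num').Category.value_counts()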