Can Pandas plot a histogram of dates?


Given this df:

        date
0 2001-08-10
1 2002-08-31
2 2003-08-29
3 2006-06-21
4 2002-03-27
5 2003-07-14
6 2004-06-15
7 2003-08-14
8 2003-07-29

and, if it's not already the case:

df["date"] = df["date"].astype("datetime64[ns]")  # recent pandas versions reject a unit-less "datetime64"
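If the column holds strings, some of which may not be valid dates, `pd.to_datetime` is an alternative to `astype`. A minimal sketch, assuming you would rather turn unparseable values into `NaT` than raise an error (the `raw` series below is made up for illustration):

```python
import pandas as pd

# Hypothetical input: two valid dates and one value that cannot be parsed
raw = pd.Series(["2001-08-10", "2002-08-31", "not a date"])

# errors="coerce" replaces unparseable entries with NaT instead of raising
converted = pd.to_datetime(raw, errors="coerce")

# Drop the NaT rows before grouping so they do not get in the way
clean = converted.dropna()
```

Whether silently dropping bad rows is acceptable depends on your data, so treat the `dropna()` step as optional.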

To show the count of dates by month:

df.groupby(df["date"].dt.month).count().plot(kind="bar")

.dt allows you to access the datetime properties.

Which will give you:

[Figure: bar chart "groupby date month"]

You can replace month with year, day, etc.
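For reference, here is the month grouping run end to end on the sample data from the question. The `reindex` step is my own addition, assuming you want months without any rows to appear as empty bars, and the variable names are only illustrative:

```python
import pandas as pd

# Sample frame copied from the question
df = pd.DataFrame({"date": [
    "2001-08-10", "2002-08-31", "2003-08-29", "2006-06-21", "2002-03-27",
    "2003-07-14", "2004-06-15", "2003-08-14", "2003-07-29",
]})
df["date"] = pd.to_datetime(df["date"])

# Count rows per calendar month (1-12), regardless of year
by_month = df["date"].groupby(df["date"].dt.month).count()

# Assumption: months with no rows should still show up, with a count of 0
by_month_full = by_month.reindex(range(1, 13), fill_value=0)

# Same idea with another datetime property, here the year
by_year = df["date"].groupby(df["date"].dt.year).count()

# by_month_full.plot(kind="bar") would draw the chart (needs matplotlib)
```

Without the `reindex`, only the months that actually occur (3, 6, 7 and 8 here) get a bar.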

If you want to distinguish year and month for instance, just do:

df.groupby([df["date"].dt.year, df["date"].dt.month]).count().plot(kind="bar")

Which gives:

[Figure: bar chart "groupby date month year"]
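A small sketch of the year/month grouping on the same sample data, in case you want to inspect the counts before plotting. Renaming the two keys and the `to_period("M")` variant are my own suggestions, not part of the original snippet:

```python
import pandas as pd

# Sample frame copied from the question
df = pd.DataFrame({"date": pd.to_datetime([
    "2001-08-10", "2002-08-31", "2003-08-29", "2006-06-21", "2002-03-27",
    "2003-07-14", "2004-06-15", "2003-08-14", "2003-07-29",
])})

# Group on (year, month); renaming the keys gives the index levels distinct names
year = df["date"].dt.year.rename("year")
month = df["date"].dt.month.rename("month")
by_year_month = df["date"].groupby([year, month]).count()

# Alternative: a single monthly Period key such as 2003-08
by_period = df["date"].groupby(df["date"].dt.to_period("M")).count()

# by_year_month.plot(kind="bar") would draw one bar per (year, month) pair
```

Both results only contain the year/month combinations that occur in the data, so gaps between them are not shown on the x axis.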

Was it what you wanted? Is this clear?

Hope this helps!


I think resample might be what you are looking for. In your case, do:

df.set_index('date', inplace=True)
# '1M' for 1 month; '1W' for 1 week; check the documentation on offset aliases
# (pandas 2.2+ deprecates 'M' in favor of 'ME' for month end)
df.resample('1M').count()

This only does the counting, not the plotting, so you then have to make your own plots.

See the pandas resample documentation for more details.
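Here is a rough sketch of the resample approach on the sample data from the question. I use the `"MS"` (month start) alias and `size()` here as assumptions of mine, mainly because `"MS"` behaves the same across pandas versions; swap in whatever offset alias fits your case:

```python
import pandas as pd

# Sample frame copied from the question
df = pd.DataFrame({"date": pd.to_datetime([
    "2001-08-10", "2002-08-31", "2003-08-29", "2006-06-21", "2002-03-27",
    "2003-07-14", "2004-06-15", "2003-08-14", "2003-07-29",
])})

# resample needs a DatetimeIndex, so move the date column into the index
indexed = df.set_index("date")

# One bin per calendar month, labelled by the first day of the month;
# size() counts the rows that fall into each bin
monthly = indexed.resample("MS").size()

# monthly.plot(kind="bar") would then draw the counts (needs matplotlib)
```

Unlike the `groupby` on `dt.month`, this keeps every month between the first and last date, including the ones with a count of zero.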

I have run into similar problems. Hope this helps.


Rendered example

[Figure: bar chart of count per hour of the day]

Example Code

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""Create random datetime object."""

# core modules
from datetime import datetime
import random

# 3rd party modules
import pandas as pd
import matplotlib.pyplot as plt


def visualize(df, column_name='start_date', color='#494949', title=''):
    """
    Visualize a dataframe with a date column.

    Parameters
    ----------
    df : Pandas dataframe
    column_name : str
        Column to visualize
    color : str
    title : str
    """
    plt.figure(figsize=(20, 10))
    ax = (df[column_name].groupby(df[column_name].dt.hour)
                         .count()).plot(kind="bar", color=color)
    ax.set_facecolor('#eeeeee')
    ax.set_xlabel("hour of the day")
    ax.set_ylabel("count")
    ax.set_title(title)
    plt.show()


def create_random_datetime(from_date, to_date, rand_type='uniform'):
    """
    Create random date within timeframe.

    Parameters
    ----------
    from_date : datetime object
    to_date : datetime object
    rand_type : {'uniform'}

    Examples
    --------
    >>> random.seed(28041990)
    >>> create_random_datetime(datetime(1990, 4, 28), datetime(2000, 12, 31))
    datetime.datetime(1998, 12, 13, 23, 38, 0, 121628)
    >>> create_random_datetime(datetime(1990, 4, 28), datetime(2000, 12, 31))
    datetime.datetime(2000, 3, 19, 19, 24, 31, 193940)
    """
    delta = to_date - from_date
    if rand_type == 'uniform':
        rand = random.random()
    else:
        raise NotImplementedError('Unknown random mode \'{}\''
                                  .format(rand_type))
    return from_date + rand * delta


def create_df(n=1000):
    """Create a Pandas dataframe with datetime objects."""
    from_date = datetime(1990, 4, 28)
    to_date = datetime(2000, 12, 31)
    sales = [create_random_datetime(from_date, to_date) for _ in range(n)]
    df = pd.DataFrame({'start_date': sales})
    return df


if __name__ == '__main__':
    import doctest
    doctest.testmod()
    df = create_df()
    visualize(df)