Pandas bar plot changes date format
The plotting code assumes that each bar in a bar plot deserves its own label. You could override this assumption by specifying your own formatter:
ax.xaxis.set_major_formatter(formatter)
The pandas.tseries.converter.TimeSeries_DateFormatter that Pandas uses to format the dates in the "good" plot works well with line plots when the x-values are dates. However, with a bar plot the x-values (at least those received by TimeSeries_DateFormatter.__call__) are merely integers starting at zero. If you try to use TimeSeries_DateFormatter with a bar plot, all the labels thus start at the Epoch, 1970-1-1 UTC, since this is the date which corresponds to zero. So the formatter used for line plots is unfortunately useless for bar plots (at least as far as I can see).
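To see the integer positions for yourself, here is a minimal sketch. The sample data and the names bar_ticks and epoch_label are made up for illustration, and the Agg backend is only chosen so that it runs without a display. It prints the tick positions pandas assigns to a bar plot and the date that a daily period with ordinal zero maps to:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Ten daily values with a DatetimeIndex (illustrative data only)
idx = pd.date_range("2012-05-01", periods=10, freq="D")
df = pd.DataFrame({"A": np.arange(10.0)}, index=idx)

fig, ax = plt.subplots()
df.plot(kind="bar", ax=ax)

# The bar plot places its ticks at 0, 1, 2, ... rather than at date values
bar_ticks = [int(t) for t in ax.get_xticks()]
print(bar_ticks)

# Interpreting position 0 as a daily period ordinal gives the Epoch
epoch_label = str(pd.Period(ordinal=0, freq="D"))
print(epoch_label)

plt.close(fig)
```

Because the ticks are plain positions, any date-aware formatter has nothing to work from except small integers.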
The easiest way I see to produce the desired formatting is to generate and set the labels explicitly:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.ticker as ticker

start = pd.to_datetime("5-1-2012")
idx = pd.date_range(start, periods=365)
df = pd.DataFrame({'A': np.random.random(365), 'B': np.random.random(365)})
df.index = idx
# resample(..., how='max') was removed from pandas; call .max() on the resampler instead
df_ts = df.resample('W').max()

# The index is used for the x-axis by default, so no x= argument is needed
ax = df_ts.plot(kind='bar', stacked=True)

# Make most of the ticklabels empty so the labels don't get too crowded
ticklabels = [''] * len(df_ts.index)
# Every 4th ticklabel shows the month and day
ticklabels[::4] = [item.strftime('%b %d') for item in df_ts.index[::4]]
# Every 12th ticklabel includes the year
ticklabels[::12] = [item.strftime('%b %d\n%Y') for item in df_ts.index[::12]]
ax.xaxis.set_major_formatter(ticker.FixedFormatter(ticklabels))
plt.gcf().autofmt_xdate()

plt.show()
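The label-thinning step in the code above can be checked without drawing anything. The following is a small sketch, assuming a made-up weekly index of 13 dates, that shows which entries the two slice assignments fill in and which stay empty:

```python
import pandas as pd

# 13 weekly dates starting on a Sunday (illustrative data only)
idx = pd.date_range("2012-05-06", periods=13, freq="W")

# Start with all labels blank
ticklabels = [""] * len(idx)
# Positions 0, 4, 8, 12 get month and day
ticklabels[::4] = [d.strftime("%b %d") for d in idx[::4]]
# Positions 0 and 12 are overwritten with a label that also carries the year
ticklabels[::12] = [d.strftime("%b %d\n%Y") for d in idx[::12]]

for pos, label in enumerate(ticklabels):
    print(pos, repr(label))
```

The order of the two assignments matters: the every-12th slice runs last, so it overwrites the shorter label at the positions both slices share.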
For those looking for a simple example of a bar plot with dates:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

# Month-end frequency; pandas 2.2 and later prefer the alias 'ME' over 'M'
dates = pd.date_range('2012-1-1', '2017-1-1', freq='M')
df = pd.DataFrame({'A': np.random.random(len(dates)), 'Date': dates})

fig, ax = plt.subplots()
df.plot.bar(x='Date', y='A', ax=ax)

# Label only every skip-th bar and leave the rest blank
ticklabels = [''] * len(df)
skip = len(df) // 12
ticklabels[::skip] = df['Date'].iloc[::skip].dt.strftime('%Y-%m-%d')
ax.xaxis.set_major_formatter(mticker.FixedFormatter(ticklabels))
fig.autofmt_xdate()

# fixes the tracker
# https://matplotlib.org/users/recipes.html
def fmt(x, pos=0, max_i=len(ticklabels) - 1):
    # Clamp the cursor position to a valid bar index
    i = int(x)
    i = 0 if i < 0 else max_i if i > max_i else i
    # Return a string so the cursor readout shows a plain date
    return dates[i].strftime('%Y-%m-%d')

ax.fmt_xdata = fmt
plt.show()
I've struggled with this problem too, and after reading several posts came up with the following solution, which seems to me slightly clearer than the matplotlib.dates approach.
Labels without modification:
import numpy as np
import pandas as pd

# Use DatetimeIndex instead of date_range for pandas earlier than 1.0.0 version
# (pandas 2.2 and later prefer the alias 'ME' over 'M' for month-end)
timeline = pd.date_range(start='2018, November', freq='M', periods=15)
df = pd.DataFrame({'date': timeline, 'value': np.random.randn(15)})
df.set_index('date', inplace=True)
df.plot(kind='bar', figsize=(12, 8), color='#2ecc71')
Labels with modification:
def line_format(label):
    """
    Convert time label to the format of pandas line plot
    """
    month = label.month_name()[:3]
    if month == 'Jan':
        month += f'\n{label.year}'
    return month

# Note that we specify rot here
ax = df.plot(kind='bar', figsize=(12, 8), color='#2ecc71', rot=0)
# Pass a list rather than a bare map object
ax.set_xticklabels(list(map(line_format, df.index)))
This approach adds the year to the label only when the month is January.
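As a quick check of that behaviour, here is a sketch that applies the same line_format function to four hand-picked month-end dates (chosen only for illustration) and prints the resulting labels:

```python
import pandas as pd

def line_format(label):
    """Abbreviate the month name and append the year for January."""
    month = label.month_name()[:3]
    if month == 'Jan':
        month += f'\n{label.year}'
    return month

# Four month-end dates spanning a year boundary (illustrative data only)
sample_index = pd.to_datetime(
    ['2018-11-30', '2018-12-31', '2019-01-31', '2019-02-28']
)
labels = [line_format(ts) for ts in sample_index]
print(labels)
```

Only the January entry carries the year on a second line, which is how the line-plot labels look.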
Here's an easy approach with pandas plot() and without using matplotlib dates:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# generate sample data
start = pd.to_datetime("1-1-2012")
index = pd.date_range(start, periods=365)
df = pd.DataFrame({'A': np.random.random(365), 'B': np.random.random(365)}, index=index)

# resample to any timeframe you need, e.g. months
# (pandas 2.2 and later prefer the alias "ME" over "M" for month-end)
df_months = df.resample("M").sum()

# plot
fig, ax = plt.subplots()
df_months.plot(kind="bar", figsize=(16, 5), stacked=True, ax=ax)

# format xtick-labels with list comprehension
ax.set_xticklabels([x.strftime("%Y-%m") for x in df_months.index], rotation=45)
plt.show()