pandas scatter plotting datetime


Not a real answer, but a workaround suggested by Tom Augspurger: use the line plot type, which does work with datetimes, and draw dots instead of lines:

df.plot(x='x', y='y', style=".")
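
For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The DataFrame contents and the Agg backend are assumptions made for illustration; only the df.plot call comes from the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data: ten timestamps one minute apart.
df = pd.DataFrame({
    "x": pd.date_range("2015-01-01", periods=10, freq="min"),
    "y": range(10),
})

# A line plot whose style string asks for point markers and no connecting line.
ax = df.plot(x="x", y="y", style=".")
line = ax.get_lines()[0]
print(line.get_marker())
```

The result is still a line plot as far as pandas is concerned, so scatter-only options such as per-point sizes or colors are not available this way.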


Building on Mike N's answer: convert the timestamps to Unix time (nanoseconds since the epoch) so that the scatter plot works, then convert the axis labels back from int64 values to strings:

type(df.ts1[0])

pandas.tslib.Timestamp

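from datetime import datetime, timezone
import numpy as np
import matplotlib.pyplot as plt
# store each timestamp as integer nanoseconds since the epoch
df['t1'] = df.ts1.astype(np.int64)
df['t2'] = df.ts2.astype(np.int64)
fig, ax = plt.subplots(figsize=(10,6))
df.plot(x='t1', y='t2', kind='scatter', ax=ax)
# read the ticks back as UTC so the labels match the naive timestamps instead of shifting to local time
ax.set_xticklabels([datetime.fromtimestamp(ts / 1e9, tz=timezone.utc).strftime('%H:%M:%S') for ts in ax.get_xticks()])
ax.set_yticklabels([datetime.fromtimestamp(ts / 1e9, tz=timezone.utc).strftime('%H:%M:%S') for ts in ax.get_yticks()])
plt.show()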
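
A variation on the relabeling step, offered as a sketch rather than as part of the original answer: matplotlib's FuncFormatter computes each label from the tick value when the axis is drawn, so the labels stay in sync if the axis limits change later. The sample data, the ns_to_label helper, the explicit datetime64[ns] cast, and the choice to read the integers as UTC are all assumptions made here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
from datetime import datetime, timezone

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter

# Hypothetical sample data with the same column names as the answer above.
df = pd.DataFrame({
    "ts1": pd.date_range("2015-01-01 09:00", periods=10, freq="min"),
    "ts2": pd.date_range("2015-01-01 12:00", periods=10, freq="min"),
})

# Pin the unit to nanoseconds so the integers mean "ns since the epoch".
df["t1"] = df.ts1.astype("datetime64[ns]").astype(np.int64)
df["t2"] = df.ts2.astype("datetime64[ns]").astype(np.int64)

def ns_to_label(value, pos=None):
    """Hypothetical helper: format integer nanoseconds as HH:MM:SS (UTC)."""
    return datetime.fromtimestamp(value / 1e9, tz=timezone.utc).strftime("%H:%M:%S")

fig, ax = plt.subplots(figsize=(10, 6))
df.plot(x="t1", y="t2", kind="scatter", ax=ax)

# Labels are generated at draw time instead of being frozen as fixed strings.
ax.xaxis.set_major_formatter(FuncFormatter(ns_to_label))
ax.yaxis.set_major_formatter(FuncFormatter(ns_to_label))
```

Naive pandas timestamps convert to integers as if they were UTC, which is why the helper formats in UTC; formatting in local time would shift every label by the local offset.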


Not an answer, but I can't edit the question or put this much in a comment, I think.

Here is a reproducible example:

from datetime import datetime
import pandas as pd
df = pd.DataFrame({'x': [datetime.now() for _ in range(10)], 'y': range(10)})
df.plot(x='x', y='y', kind='scatter')

This gives KeyError: 'x'.

Interestingly, you do get a plot with just df.plot(x='x', y='y'); it chooses poorly for the default x range because the times are just nanoseconds apart, which is weird, but that's a separate issue. It seems like if you can make a line graph, you should be able to make a scatterplot too.
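
One possible way around the failing call, sketched here under the assumption that going through matplotlib directly is acceptable, is to pass the columns to Axes.scatter instead of df.plot. Reasonably recent matplotlib versions convert datetime values on their own; the sample data below is hypothetical and spaced a minute apart so the default x range is sensible.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data: ten timestamps one minute apart.
df = pd.DataFrame({
    "x": pd.date_range("2015-01-01", periods=10, freq="min"),
    "y": range(10),
})

fig, ax = plt.subplots()
# matplotlib converts the datetime column to its own date units internally.
points = ax.scatter(df["x"], df["y"])
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

This sidesteps the pandas plotting wrapper entirely, so it says nothing about whether kind='scatter' itself should accept datetimes.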

There is a pandas github issue about this problem, but it was closed for some reason. I'm going to go comment there and see if we can re-start that conversation.

Is there some clever work-around for this? If so, what?