Keep only date part when using pandas.to_datetime
Since version 0.15.0 this can be done easily using
.dt to access just the date component:
df['just_date'] = df['dates'].dt.date
The above returns a column of Python
datetime.date objects (object dtype). If you want to keep
datetime64, you can instead
normalize the time component to midnight, which sets all the times to 00:00:00:
df['normalised_date'] = df['dates'].dt.normalize()
This keeps the dtype as
datetime64, but the display shows just the date value.
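A minimal sketch of the difference between the two approaches, assuming a small made-up frame (the sample timestamps are illustrative only, not from the question):

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {"dates": pd.to_datetime(["2021-03-01 14:30:00", "2021-03-02 09:15:00"])}
)

# .dt.date yields Python datetime.date objects, so the column is no longer datetime64.
df["just_date"] = df["dates"].dt.date

# .dt.normalize() sets the time to midnight and keeps a datetime64 dtype.
df["normalised_date"] = df["dates"].dt.normalize()

print(df.dtypes)
```

Checking df.dtypes like this is a quick way to see which of the two columns can still use the vectorized .dt accessor.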
While I upvoted EdChum's answer, which is the most direct answer to the question the OP posed, it does not really solve the performance problem: it still relies on Python
datetime objects, so any operation on them will not be vectorized and will therefore be slow.
A better performing alternative is to use
df['dates'].dt.floor('d'). Strictly speaking, it does not "keep only date part", since it just sets the time to
00:00:00. But it does work as desired by the OP when, for instance:
- printing to screen
- saving to csv
- using the column to groupby
... and it is much more efficient, since the operation is vectorized.
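As a rough sketch of how the floored column can be used, assuming made-up data and a hypothetical "value" column (grouping by day is just one possible use; the uppercase "D" alias is used here on the assumption that it behaves like the 'd' above):

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {
        "dates": pd.to_datetime(
            ["2021-03-01 14:30:00", "2021-03-01 18:00:00", "2021-03-02 09:15:00"]
        ),
        "value": [1, 2, 3],
    }
)

# Floor every timestamp to the start of its day; the dtype stays datetime64.
df["day"] = df["dates"].dt.floor("D")

# The floored column can be used directly as a grouping key.
daily = df.groupby("day")["value"].sum()
print(daily)
```

Because "day" is still datetime64, later steps such as filtering by date range or resampling keep working without converting back.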
EDIT: in fact, the answer the OP would have preferred is probably "recent versions of
pandas do not write the time to csv if it is
00:00:00 for all observations".
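A small sketch to check this behaviour on your own pandas version, assuming made-up data (the exact output format may differ between versions, so treat this as something to verify rather than a guarantee):

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {"dates": pd.to_datetime(["2021-03-01 14:30:00", "2021-03-02 09:15:00"])}
)

# After flooring, every timestamp in the column is at midnight.
df["dates"] = df["dates"].dt.floor("D")

# With no path argument, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)
print(csv_text)
```

If your version does write the time anyway, passing date_format="%Y-%m-%d" to to_csv is a way to force date-only output.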