Convert timedelta64[ns] column to seconds in Python Pandas DataFrame


This works properly in Pandas 0.14, the current version at the time of writing:

In [132]: df[:5]['duration'] / np.timedelta64(1, 's')
Out[132]:
0    1232
1    1390
2    1495
3     797
4    1132
Name: duration, dtype: float64
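For anyone who wants to try this without the original data, here is a minimal self-contained sketch. The DataFrame and its `duration` values are made up for illustration (chosen to mirror the output above); only the division by `np.timedelta64(1, 's')` comes from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: durations chosen to mirror the output shown above.
df = pd.DataFrame(
    {"duration": pd.to_timedelta([1232, 1390, 1495, 797, 1132], unit="s")}
)

# Dividing a timedelta column by a one-second timedelta yields float seconds.
seconds = df["duration"] / np.timedelta64(1, "s")
print(seconds)
```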

Here is a workaround for older versions of Pandas/NumPy:

In [131]: df[:5]['duration'].values.view('<i8')/10**9
Out[131]: array([1232, 1390, 1495,  797, 1132], dtype=int64)

timedelta64 and datetime64 data are stored internally as 8-byte ints (dtype '<i8'). So the above views the timedelta64s as 8-byte ints and then does integer division to convert nanoseconds to seconds. (On Python 3, use // instead of / to get integer division.)
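A rough sketch of the same idea on Python 3, using made-up durations. The explicit cast to `timedelta64[ns]` is my own precaution so that the underlying integers are guaranteed to count nanoseconds; the answer itself assumes that resolution.

```python
import numpy as np
import pandas as pd

# Hypothetical durations, forced to nanosecond resolution so the
# underlying 64-bit integers count nanoseconds.
durations = pd.Series(pd.to_timedelta([1232, 797], unit="s")).astype("timedelta64[ns]")

# Reinterpret the 8-byte timedelta values as 8-byte integers (no copy of meaning,
# just a different view of the same bytes).
nanos = durations.to_numpy().view("i8")

# Floor division converts nanoseconds to whole seconds.
whole_seconds = nanos // 10**9
print(whole_seconds)
```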

Note that you need NumPy version 1.7 or newer to work with datetime64/timedelta64s.


Use the Series dt accessor to get access to the methods and attributes of a datetime (timedelta) series.

>>> s
0   -1 days +23:45:14.304000
1   -1 days +23:46:57.132000
2   -1 days +23:49:25.913000
3   -1 days +23:59:48.913000
4            00:00:00.820000
dtype: timedelta64[ns]
>>>
>>> s.dt.total_seconds()
0   -885.696
1   -782.868
2   -634.087
3    -11.087
4      0.820
dtype: float64
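A small runnable sketch of the accessor, with two made-up values (one negative, one sub-second) built from milliseconds rather than the data shown above:

```python
import pandas as pd

# Hypothetical series: a negative duration and a sub-second one,
# both given in milliseconds.
s = pd.Series(pd.to_timedelta([-885696, 820], unit="ms"))

# total_seconds() returns the full duration as float seconds,
# including the sign and the fractional part.
total = s.dt.total_seconds()
print(total)
```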

There are other Pandas Series Accessors for String, Categorical, and Sparse data types.


Just realized this is an old thread, but I'm leaving this here in case wanderers like me click only on the top 5 search results and end up here.

Make sure that your types are correct.

  • If you want to convert a datetime to seconds, just sum the seconds contributed by the hour, minute, and second of the datetime object. This works if the duration falls within a single date.

      • hours - hours x 3600 = seconds
      • minutes - minutes x 60 = seconds
      • seconds - seconds

linear_df['duration'].dt.hour*3600 + linear_df['duration'].dt.minute*60 + linear_df['duration'].dt.second
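As far as I can tell, `.dt.hour`, `.dt.minute` and `.dt.second` are only available on datetime columns, not on timedelta ones, so the line above needs a datetime64 column. A minimal sketch under that assumption, with a hypothetical `ts` column and made-up timestamps:

```python
import pandas as pd

# Hypothetical datetime column; the names and values are illustrative only.
df = pd.DataFrame(
    {"ts": pd.to_datetime(["2016-12-31 09:33:27", "2016-12-31 00:00:05"])}
)

# Seconds elapsed since midnight: hours * 3600 + minutes * 60 + seconds.
secs = df["ts"].dt.hour * 3600 + df["ts"].dt.minute * 60 + df["ts"].dt.second
print(secs)
```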

  • If you want to convert a timedelta to seconds, use the one below.

linear_df[:5]['duration'].astype('timedelta64[s]')

I got it to work like this:

The start_dt and end_dt columns are in this format:

import datetime
linear_df[:5]['start_dt']
0   1970-02-22 21:32:48.000
1   2016-12-30 17:47:33.216
2   2016-12-31 09:33:27.931
3   2016-12-31 09:52:53.486
4   2016-12-31 10:29:44.611
Name: start_dt, dtype: datetime64[ns]

My duration was in timedelta64[ns] format, obtained by subtracting the start datetime from the end datetime.

linear_df['duration'] = linear_df['end_dt'] - linear_df['start_dt']

The resulting duration column looks like this:

linear_df[:5]['duration']
0          0 days 00:00:14
1   2 days 17:44:50.558000
2   0 days 15:37:28.418000
3   0 days 18:45:45.727000
4   0 days 19:21:27.159000
Name: duration, dtype: timedelta64[ns]

Using pandas, I got the duration between the two dates as float seconds, which makes it easier to compare or filter durations afterwards.

linear_df[:5]['duration'].astype('timedelta64[s]')
0        14.0
1    236690.0
2     56248.0
3     67545.0
4     69687.0
Name: duration, dtype: float64
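One caveat: the result of `astype('timedelta64[s]')` seems to depend on the pandas version. The float64 output above matches older releases, while to my knowledge pandas 2.0 and later return a timedelta64[s] column instead. A sketch that should give numeric whole seconds either way, using `dt.total_seconds()` on two made-up values taken from the output above:

```python
import numpy as np
import pandas as pd

# Hypothetical durations, copied from the example output for illustration.
duration = pd.Series(
    pd.to_timedelta(["0 days 00:00:14", "2 days 17:44:50.558000"])
)

# total_seconds() keeps the fractional part; np.floor drops it, which
# matches the whole-second values shown above for positive durations.
whole = np.floor(duration.dt.total_seconds())
print(whole)
```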

In my case, I wanted to get all durations longer than 1 second.
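The filter itself is not shown in the thread, so the following is only one possible way to do it: a boolean mask on the duration in seconds. The column names follow the ones above; the timestamps are made up.

```python
import pandas as pd

# Hypothetical start/end timestamps; the first pair is only 0.5 s apart.
linear_df = pd.DataFrame({
    "start_dt": pd.to_datetime(["2016-12-31 09:33:27.931", "2016-12-31 10:29:44.611"]),
    "end_dt": pd.to_datetime(["2016-12-31 09:33:28.431", "2016-12-31 10:30:00.000"]),
})
linear_df["duration"] = linear_df["end_dt"] - linear_df["start_dt"]

# Keep only the rows whose duration exceeds one second.
long_rows = linear_df[linear_df["duration"].dt.total_seconds() > 1]
print(long_rows)
```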

Hope it helps.