Calculate Time Difference Between Two Pandas Columns in Hours and Minutes


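Subtracting one pandas timestamp column from another returns a Series of timedelta64[ns] values (each element is a pandas.Timedelta, a subclass of datetime.timedelta). This can easily be converted into hours by using the *astype* method, like so: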

    import pandas

    df = pandas.DataFrame(columns=['to','fr','ans'])
    df.to = [pandas.Timestamp('2014-01-24 13:03:12.050000'), pandas.Timestamp('2014-01-27 11:57:18.240000'), pandas.Timestamp('2014-01-23 10:07:47.660000')]
    df.fr = [pandas.Timestamp('2014-01-26 23:41:21.870000'), pandas.Timestamp('2014-01-27 15:38:22.540000'), pandas.Timestamp('2014-01-23 18:50:41.420000')]
    (df.fr-df.to).astype('timedelta64[h]')

to yield,

    0    58
    1     3
    2     8
    dtype: float64
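If the astype conversion does not work in your environment (newer pandas releases, 2.0 and later as far as I can tell, appear to accept only second-or-finer resolutions there), floor division by a one-hour Timedelta is a version-independent way to get the same whole hours. This is a minimal sketch, not part of the original answer; the Series names `to`, `fr` and `whole_hours` are just for illustration.

```python
import pandas as pd

# Same timestamps as in the example above
to = pd.Series(pd.to_datetime(['2014-01-24 13:03:12.050000',
                               '2014-01-27 11:57:18.240000',
                               '2014-01-23 10:07:47.660000']))
fr = pd.Series(pd.to_datetime(['2014-01-26 23:41:21.870000',
                               '2014-01-27 15:38:22.540000',
                               '2014-01-23 18:50:41.420000']))

# Floor-divide the timedelta Series by one hour: whole hours as integers
whole_hours = (fr - to) // pd.Timedelta(hours=1)
print(whole_hours)
```

The result is an integer Series rather than float64, but the values match the output shown above.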


This was driving me bonkers as the .astype() solution above didn't work for me. But I found another way. Haven't timed it or anything, but might work for others out there:

    import pandas as pd

    t1 = pd.to_datetime('1/1/2015 01:00')
    t2 = pd.to_datetime('1/1/2015 03:30')
    print(pd.Timedelta(t2 - t1).total_seconds() / 3600.0)

...if you want hours. Or:

    print(pd.Timedelta(t2 - t1).total_seconds() / 60.0)

...if you want minutes.

UPDATE: There used to be a helpful comment here that mentioned using .total_seconds() for time periods spanning multiple days, since .seconds leaves out the days part of the difference. Since the comment is gone, I've updated the answer.
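To make the multi-day point concrete, here is a small sketch comparing .seconds with .total_seconds() on a difference longer than one day. The dates are made up for illustration and are not from the question.

```python
import pandas as pd

t1 = pd.to_datetime('2015-01-01 01:00')
t2 = pd.to_datetime('2015-01-03 03:30')
delta = t2 - t1  # 2 days, 2 hours and 30 minutes

# .seconds holds only the sub-day remainder, so the 2 days are dropped
partial_hours = delta.seconds / 3600.0

# .total_seconds() covers the whole span, days included
total_hours = delta.total_seconds() / 3600.0

print(partial_hours, total_hours)  # 2.5 50.5
```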


  • How do I convert my results to only hours and minutes?
    • The accepted answer only returns days + hours. Minutes are not included.
  • Providing a column that shows hours and minutes as hh:mm or x hours y minutes requires additional calculations and string formatting.
  • This answer shows how to get either total hours or total minutes as a float, using timedelta math, and is faster than using .astype('timedelta64[h]')
  • Pandas Time Deltas User Guide
  • Pandas Time series / date functionality User Guide
  • python timedelta objects: See supported operations.
  • The following sample data is already a datetime64[ns] dtype. It is required that all relevant columns are converted using pandas.to_datetime().
    import pandas as pd

    # test data from OP, with values already in a datetime format
    data = {'to_date': [pd.Timestamp('2014-01-24 13:03:12.050000'), pd.Timestamp('2014-01-27 11:57:18.240000'), pd.Timestamp('2014-01-23 10:07:47.660000')],
            'from_date': [pd.Timestamp('2014-01-26 23:41:21.870000'), pd.Timestamp('2014-01-27 15:38:22.540000'), pd.Timestamp('2014-01-23 18:50:41.420000')]}

    # test dataframe; the columns must be in a datetime format; use pandas.to_datetime if needed
    df = pd.DataFrame(data)

    # add a timedelta column if wanted. It's added here for information only
    # df['time_delta_with_sub'] = df.from_date.sub(df.to_date)  # also works
    df['time_delta'] = (df.from_date - df.to_date)

    # create a column with timedelta as total hours, as a float type
    df['tot_hour_diff'] = (df.from_date - df.to_date) / pd.Timedelta(hours=1)

    # create a column with timedelta as total minutes, as a float type
    df['tot_mins_diff'] = (df.from_date - df.to_date) / pd.Timedelta(minutes=1)

    # display(df)
                      to_date               from_date             time_delta  tot_hour_diff  tot_mins_diff
    0 2014-01-24 13:03:12.050 2014-01-26 23:41:21.870 2 days 10:38:09.820000      58.636061    3518.163667
    1 2014-01-27 11:57:18.240 2014-01-27 15:38:22.540 0 days 03:41:04.300000       3.684528     221.071667
    2 2014-01-23 10:07:47.660 2014-01-23 18:50:41.420 0 days 08:42:53.760000       8.714933     522.896000
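The hh:mm string formatting mentioned in the list above is not shown in this answer, so here is one possible sketch. The helper name `format_hours_minutes` is hypothetical, and it assumes non-negative differences with no missing (NaT) values; seconds are truncated rather than rounded.

```python
import pandas as pd

def format_hours_minutes(delta: pd.Series) -> pd.Series:
    """Format a timedelta Series as 'h:mm' strings (hypothetical helper)."""
    # Whole minutes in the full span, days included
    total_minutes = (delta.dt.total_seconds() // 60).astype('int64')
    hours = total_minutes // 60
    minutes = total_minutes % 60
    # Zero-pad minutes so 3 hours 5 minutes reads as '3:05'
    return hours.astype(str) + ':' + minutes.astype(str).str.zfill(2)

df = pd.DataFrame({
    'to_date': pd.to_datetime(['2014-01-24 13:03:12.050000',
                               '2014-01-27 11:57:18.240000',
                               '2014-01-23 10:07:47.660000']),
    'from_date': pd.to_datetime(['2014-01-26 23:41:21.870000',
                                 '2014-01-27 15:38:22.540000',
                                 '2014-01-23 18:50:41.420000']),
})
df['hh_mm'] = format_hours_minutes(df.from_date - df.to_date)
print(df['hh_mm'].tolist())  # ['58:38', '3:41', '8:42']
```

Hours are not capped at 24 here, so a difference of 2 days 10:38 is shown as 58:38 rather than split into days.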

Other methods

  • An item of note from the podcast in Other Resources: .total_seconds() was added and merged while the core developer was on vacation, and would not otherwise have been approved.
    • This is also why there aren't other .total_xx methods.
    # convert the entire timedelta to seconds
    # this is the same as td / timedelta(seconds=1)
    (df.from_date - df.to_date).dt.total_seconds()
    [out]:
    0    211089.82
    1     13264.30
    2     31373.76
    dtype: float64

    # get the number of days
    (df.from_date - df.to_date).dt.days
    [out]:
    0    2
    1    0
    2    0
    dtype: int64

    # get the seconds for hours + minutes + seconds, but not days
    # note the difference from total_seconds
    (df.from_date - df.to_date).dt.seconds
    [out]:
    0    38289
    1    13264
    2    31373
    dtype: int64
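One more accessor that may be useful here, although it is not part of the original answer, is .dt.components, which splits each timedelta into separate days, hours, minutes and seconds columns. The sketch below rebuilds the three differences from the time_delta column shown earlier; the variable names are illustrative.

```python
import pandas as pd

# The three differences from the time_delta column above
delta = pd.Series(pd.to_timedelta(['2 days 10:38:09.82',
                                   '0 days 03:41:04.3',
                                   '0 days 08:42:53.76']))

# One row per timedelta, one integer column per unit
parts = delta.dt.components
print(parts[['days', 'hours', 'minutes', 'seconds']])
```

Unlike total hours, the hours column here never exceeds 23, since full days are counted separately.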

Other Resources

%%timeit test

    import pandas as pd

    # dataframe with 2M rows
    data = {'to_date': [pd.Timestamp('2014-01-24 13:03:12.050000'), pd.Timestamp('2014-01-27 11:57:18.240000')], 'from_date': [pd.Timestamp('2014-01-26 23:41:21.870000'), pd.Timestamp('2014-01-27 15:38:22.540000')]}
    df = pd.DataFrame(data)
    df = pd.concat([df] * 1000000).reset_index(drop=True)

    %%timeit
    (df.from_date - df.to_date) / pd.Timedelta(hours=1)
    [out]:
    43.1 ms ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

    %%timeit
    (df.from_date - df.to_date).astype('timedelta64[h]')
    [out]:
    59.8 ms ± 1.29 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
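%%timeit only works inside IPython or Jupyter. For a plain Python script, a rough equivalent can be sketched with the standard timeit module. The frame below is smaller (200k rows) to keep the run short, only the division approach is timed, and the function name `hours_by_division` is made up for this example; timings will differ by machine and pandas version.

```python
import timeit

import pandas as pd

data = {'to_date': [pd.Timestamp('2014-01-24 13:03:12.050000'),
                    pd.Timestamp('2014-01-27 11:57:18.240000')],
        'from_date': [pd.Timestamp('2014-01-26 23:41:21.870000'),
                      pd.Timestamp('2014-01-27 15:38:22.540000')]}
# 2 rows repeated 100,000 times: 200k rows
df = pd.concat([pd.DataFrame(data)] * 100_000, ignore_index=True)

def hours_by_division():
    # total hours as float, via timedelta division
    return (df.from_date - df.to_date) / pd.Timedelta(hours=1)

# best of 3 repeats, 10 loops each, reported per loop
best = min(timeit.repeat(hours_by_division, number=10, repeat=3)) / 10
print(f'division: {best * 1e3:.2f} ms per loop')
```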