How do I translate an ISO 8601 datetime string into a Python datetime object? [duplicate]


I prefer using the dateutil library for timezone handling and generally solid date parsing. If you were to get an ISO 8601 string like: 2010-05-08T23:41:54.000Z you'd have a fun time parsing that with strptime, especially if you didn't know up front whether or not the timezone was included. pyiso8601 has a couple of issues (check their tracker) that I ran into during my usage and it hasn't been updated in a few years. dateutil, by contrast, has been active and worked for me:

from dateutil import parser
yourdate = parser.parse(datestring)


Since Python 3.7, and without any external libraries, you can use the strptime function from the datetime module:

datetime.datetime.strptime('2019-01-04T16:41:24+0200', "%Y-%m-%dT%H:%M:%S%z")

For more formatting options, see the strftime() and strptime() format codes in the datetime module documentation.
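As a rough sketch of the strptime approach on Python 3.7 or later: the helper name parse_to_utc is my own (hypothetical), and it assumes whole seconds and an explicit offset in the input. From 3.7 on, %z also accepts a colon inside the offset and a literal Z, so one format string covers all three spellings below.

```python
from datetime import datetime, timezone

# Assumed input shape: date, "T", time with whole seconds, then an offset.
FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_to_utc(text):
    """Parse an ISO 8601 string that carries an offset and normalize it to UTC.

    parse_to_utc is a hypothetical helper name, not part of the datetime module.
    """
    aware = datetime.strptime(text, FORMAT)
    return aware.astimezone(timezone.utc)


# The same instant written three ways; each line prints 2019-01-04T14:41:24+00:00
for text in (
    "2019-01-04T16:41:24+0200",
    "2019-01-04T16:41:24+02:00",
    "2019-01-04T14:41:24Z",
):
    print(parse_to_utc(text).isoformat())
```

The result is a timezone-aware datetime, so comparisons against other aware datetimes work regardless of the offset the string was written in.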

Python 2 doesn't support the %z format specifier, so it's best to explicitly use Zulu time everywhere if possible:

datetime.datetime.strptime("2007-03-04T21:08:12Z", "%Y-%m-%dT%H:%M:%SZ")


ISO 8601 allows many variations in which the colons and dashes are optional, basically CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]. If you want to use strptime, you need to strip out those variations first.

The goal is to generate a UTC datetime object.


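If you just want a basic case that works for UTC with the Z suffix, like 2016-06-29T19:36:29.3453Z (str.replace is used here because translate(None, ':-') only works on Python 2 strings):

datetime.datetime.strptime(timestamp.replace("-", "").replace(":", ""), "%Y%m%dT%H%M%S.%fZ")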
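Here is a small sketch of that basic Z-only case wrapped in a function. The name parse_zulu is hypothetical, it assumes Python 3 (for datetime.timezone), and it assumes the input always ends in Z. Since the Z in the format string is only a literal, strptime returns a naive datetime, so the sketch attaches UTC explicitly and also tolerates missing fractional seconds.

```python
import datetime


def parse_zulu(timestamp):
    """Parse a Z-suffixed ISO 8601 timestamp into an aware UTC datetime.

    parse_zulu is a hypothetical helper; it assumes the input ends in "Z".
    """
    # Strip the optional delimiters so extended and basic forms look the same.
    compact = timestamp.replace("-", "").replace(":", "")
    # Fractional seconds are optional, so pick the matching format.
    fmt = "%Y%m%dT%H%M%S.%fZ" if "." in compact else "%Y%m%dT%H%M%SZ"
    naive = datetime.datetime.strptime(compact, fmt)
    # "Z" was matched as a literal, so mark the result as UTC ourselves.
    return naive.replace(tzinfo=datetime.timezone.utc)


print(parse_zulu("2016-06-29T19:36:29.3453Z"))
print(parse_zulu("20160629T193629Z"))
```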

If you want to handle timezone offsets like 2016-06-29T19:36:29.3453-0400 or 2008-09-03T20:56:35.450686+05:00, use the following. It converts all variations into a form without variable delimiters, like 20080903T205635.450686+0500, making the timestamp more consistent and easier to parse.

import re
import datetime
# This regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
datetime.datetime.strptime(conformed_timestamp, "%Y%m%dT%H%M%S.%f%z")
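To make the effect of the regex concrete, here is a sketch that runs it over the two sample timestamps and then converts the result to UTC. The names DELIMITERS, conform and parse_conformed are hypothetical, and it assumes a Python 3 build where %z is supported and inputs that carry fractional seconds.

```python
import re
from datetime import datetime, timezone

# Same pattern as in the answer: drop every colon, and every dash that is
# not the sign of a trailing hh:mm or hhmm offset.
DELIMITERS = re.compile(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))")


def conform(timestamp):
    """Return the timestamp with its optional delimiters stripped (hypothetical helper)."""
    return DELIMITERS.sub("", timestamp)


def parse_conformed(timestamp):
    """Parse a timestamp with fractional seconds and an offset, normalized to UTC."""
    aware = datetime.strptime(conform(timestamp), "%Y%m%dT%H%M%S.%f%z")
    return aware.astimezone(timezone.utc)


for sample in ("2016-06-29T19:36:29.3453-0400", "2008-09-03T20:56:35.450686+05:00"):
    print(conform(sample), "->", parse_conformed(sample).isoformat())
```

Note that the negative lookahead is what keeps the minus sign of -0400 in place while the dashes inside the date are removed.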

If your system does not support the %z strptime directive (you see something like ValueError: 'z' is a bad directive in format '%Y%m%dT%H%M%S.%f%z') then you need to apply the offset manually to get the time in UTC. Note that %z may not work on your system in Python versions < 3, as it depended on C library support, which varies across systems and Python build types (e.g., Jython, Cython).

import re
import datetime
# This regex removes all colons and all
# dashes EXCEPT for the dash indicating + or - utc offset for the timezone
conformed_timestamp = re.sub(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))", '', timestamp)
# Split on the offset to remove it. Use a capture group to keep the delimiter
split_timestamp = re.split(r"([+-])", conformed_timestamp)
main_timestamp = split_timestamp[0]
if len(split_timestamp) == 3:
    sign = split_timestamp[1]
    offset = split_timestamp[2]
else:
    sign = None
    offset = None
# Generate the datetime object without the offset
output_datetime = datetime.datetime.strptime(main_timestamp + "Z", "%Y%m%dT%H%M%S.%fZ")
if offset:
    # Create timedelta based on offset
    offset_delta = datetime.timedelta(hours=int(sign + offset[:-2]), minutes=int(sign + offset[-2:]))
    # The parsed time is local time (UTC + offset), so subtract the offset to get UTC
    output_datetime = output_datetime - offset_delta
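For reuse, the manual approach can be packaged as a function. This is only a sketch under my own assumptions: to_naive_utc, DELIMITERS and SUFFIX are hypothetical names, the offset is expected as hhmm or hh:mm, and a trailing Z or a missing offset is treated as UTC. Since a timestamp's local time equals UTC plus its offset, the sketch gets UTC by subtracting a positive offset and adding a negative one.

```python
import re
import datetime

# Drop every colon, and every dash that is not the sign of a trailing offset.
DELIMITERS = re.compile(r"[:]|([-](?!((\d{2}[:]\d{2})|(\d{4}))$))")
# Split the conformed string into the main part and an optional Z or +hhmm/-hhmm.
SUFFIX = re.compile(r"^(.*?)(?:Z|([+-])(\d{2})(\d{2}))?$")


def to_naive_utc(timestamp):
    """Return a naive datetime in UTC without relying on the %z directive.

    to_naive_utc is a hypothetical helper, not a library function.
    """
    conformed = DELIMITERS.sub("", timestamp)
    main, sign, hours, minutes = SUFFIX.match(conformed).groups()
    # Fractional seconds are optional, so pick the matching format.
    fmt = "%Y%m%dT%H%M%S.%f" if "." in main else "%Y%m%dT%H%M%S"
    local = datetime.datetime.strptime(main, fmt)
    if sign is None:
        # "Z" or no offset at all: assume the time is already UTC.
        return local
    delta = datetime.timedelta(hours=int(hours), minutes=int(minutes))
    # local = UTC + offset, so UTC = local - offset.
    return local - delta if sign == "+" else local + delta


print(to_naive_utc("2008-09-03T20:56:35.450686+05:00"))
print(to_naive_utc("2016-06-29T19:36:29.3453-0400"))
```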