ValueError: Cannot cast DatetimeIndex to dtype datetime64[us] ValueError: Cannot cast DatetimeIndex to dtype datetime64[us] postgresql postgresql

ValueError: Cannot cast DatetimeIndex to dtype datetime64[us]


Using pd.to_datetime() on each element worked. Option 4, which doesn't work, applies pd.to_datetime() to the entire series. Perhaps the Postgres driver understands python datetime, but not datetime64 in pandas & numpy. Option 4 produced the correct output, but I got ValueError (see title) when sending the DF to Postgres

timesAsPyDt = (spy0030Df['dt']).apply(lambda d: pd.to_datetime(str(d)))


I had the same problem and applying pd.to_datetime() on each element worked as well. But it is orders of magnitude slower than running pd.to_datetime() on the entire series. For a dataframe with over a 1 million rows:

(df['Time']).apply(lambda d: pd.to_datetime(str(d)))

takes approximately 70 seconds

and

pd.to_datetime(df['Time'])

takes approximately 0.01 seconds

The actual problem is that timezone information is being included. To remove it:

t = pd.to_datetime(df['Time'])t = t.tz_localize(None)

This should be much faster!


Actually, this was my data frame.

                              Biomass  Fossil Brown coal/Lignite  Fossil Coal-derived gas  Fossil Gas  Fossil Hard coal  Fossil Oil  Geothermal  Hydro Pumped Storage  Hydro Run-of-river and poundage  Hydro Water Reservoir  Nuclear   Other  Other renewable    Solar  Waste  Wind Offshore  Wind Onshore2018-02-02 00:00:00+01:00   4835.0                    16275.0                    446.0      1013.0            4071.0       155.0         5.0                   7.0                           1906.0                   35.0   8924.0  3643.0            142.0      0.0  595.0         2517.0       19999.02018-02-02 00:15:00+01:00   4834.0                    16272.0                    446.0      1010.0            3983.0       155.0         5.0                   7.0                           1908.0                   71.0   8996.0  3878.0            142.0      0.0  594.0         2364.0       19854.02018-02-02 00:30:00+01:00   4828.0                    16393.0                    446.0      1019.0            4015.0       155.0         5.0    

I was trying to insert into SQL database but getting the same error as in the above question. What i have done is, convert the index of the data frame to the column with a label 'index'.

df.reset_index(level=0, inplace=True)  

Rename the column name 'index' to 'DateTime' by using this code.

df = df.rename(columns={'index': 'DateTime'})

Change the datatype to the 'datetime64'.

df['DateTime'] = df['DateTime'].astype('datetime64')

Store it in the sql database using these code.

engine = create_engine('mysql+mysqlconnector://root:Password@localhost/generation_data', echo=True)df.to_sql(con=engine, name='test', if_exists='replace')