Python Pandas write to sql with NaN values


Update: starting with pandas 0.15, to_sql supports writing NaN values (they are written as NULL in the database), so the workaround described below should no longer be needed (see https://github.com/pydata/pandas/pull/8208).
Pandas 0.15 will be released this coming October, and the feature is already merged in the development version.
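
To illustrate the behaviour described in the update, here is a minimal sketch that assumes a recent pandas and an in-memory SQLite database. The sample data, the table name, and the use of index=False are illustrative choices, not part of the original question.

```python
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical sample data: one float column with a NaN, one string column.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})

con = sqlite3.connect(":memory:")
# index=False is an assumption made here to keep the table to two columns.
df.to_sql("table_name", con, index=False)

# Read the raw rows back through the DB-API to see what was stored.
rows = con.execute("SELECT a, b FROM table_name ORDER BY rowid").fetchall()
null_count = con.execute(
    "SELECT COUNT(*) FROM table_name WHERE a IS NULL"
).fetchone()[0]
con.close()
```

Here the NaN in column a should come back as None in rows, i.e. it was stored as NULL.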


This is probably due to NaN values in your table. It is a known shortcoming at the moment that the pandas sql functions don't handle NaNs well (https://github.com/pydata/pandas/issues/2754, https://github.com/pydata/pandas/issues/4199).

As a workaround for now (for pandas versions 0.14.1 and lower), you can manually convert the NaN values to None with:

df2 = df.astype(object).where(pd.notnull(df), None)

and then write the dataframe to sql. However, this converts all columns to object dtype. Because of this, you have to create the database table based on the original dataframe. E.g., if your first row does not contain NaNs:

df[:1].to_sql('table_name', con)
df2[1:].to_sql('table_name', con, if_exists='append')
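
Put together, the workaround might look like the following sketch. It assumes an in-memory SQLite connection and made-up sample data, and it passes index=False (an addition not in the snippets above) so that only the data columns are written.

```python
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical sample data; the first row contains no NaN.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", "y", "z"]})

# Convert to object dtype so that None can replace NaN.
df2 = df.astype(object).where(pd.notnull(df), None)

con = sqlite3.connect(":memory:")
# Create the table from the original, correctly typed first row ...
df[:1].to_sql("table_name", con, index=False)
# ... then append the remaining rows from the object-dtype copy.
df2[1:].to_sql("table_name", con, if_exists="append", index=False)

values = [row[0] for row in con.execute("SELECT a FROM table_name ORDER BY rowid")]
con.close()
```

The table schema comes from the typed frame, while the appended rows carry None where the original had NaN.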


Using the previous solution will change the column dtype from float64 to object.

I have found a better solution: just add the following _write_mysql function:

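import numpy as np
from pandas.io import sql

def _write_mysql(frame, table, names, cur):
    # Quote column names with backticks, as MySQL expects
    bracketed_names = ['`' + column + '`' for column in names]
    col_names = ','.join(bracketed_names)
    # One %s placeholder per column
    wildcards = ','.join([r'%s'] * len(names))
    insert_query = "INSERT INTO %s (%s) VALUES (%s)" % (
        table, col_names, wildcards)
    # Replace float NaN with None so the driver writes NULL;
    # isinstance also matches numpy.float64, which type(y) == float would miss
    data = [[None if isinstance(y, float) and np.isnan(y) else y for y in x]
            for x in frame.values]
    cur.executemany(insert_query, data)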

And then override its implementation in pandas as below:

sql._write_mysql = _write_mysql

With this code, NaN values will be saved correctly in the database without altering the column type.
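
The core of that function is the row-by-row replacement of NaN with None before calling executemany. As a rough, standard-library-only sketch of the same idea, the example below uses SQLite instead of MySQL (so the placeholders are ? rather than %s); the helper name nan_to_none, the table t, and the sample rows are all hypothetical.

```python
import math
import sqlite3


def nan_to_none(rows):
    """Return a copy of rows with every float NaN replaced by None."""
    return [
        [None if isinstance(v, float) and math.isnan(v) else v for v in row]
        for row in rows
    ]


# Hypothetical rows standing in for frame.values.
rows = [[1.0, "x"], [float("nan"), "y"], [3.0, None]]
data = nan_to_none(rows)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a REAL, b TEXT)")
# sqlite3 uses "?" placeholders; MySQL drivers typically use "%s".
con.executemany("INSERT INTO t (a, b) VALUES (?, ?)", data)
stored = con.execute("SELECT a, b FROM t ORDER BY rowid").fetchall()
con.close()
```

Because the conversion happens on the row values only, the column types declared in the table are left untouched.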