How to insert pandas dataframe via mysqldb into database?
Update:
There is now a to_sql method, which is the preferred way to do this, rather than write_frame:
df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')
Also note: the syntax may change in pandas 0.14...
You can set up the connection with MySQLdb:
from pandas.io import sql
import MySQLdb
con = MySQLdb.connect()  # may need to add some other options to connect
Setting the flavor of write_frame to 'mysql' means you can write to MySQL:
sql.write_frame(df, con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')
The argument if_exists tells pandas what to do if the table already exists:
if_exists: {'fail', 'replace', 'append'}, default 'fail'
fail: If table exists, do nothing. (In current pandas releases, to_sql raises a ValueError in this case.)
replace: If table exists, drop it, recreate it, and insert data.
append: If table exists, insert data. Create if does not exist.
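The three modes can be tried without a MySQL server. The following is only a sketch, not part of the original answer: it uses an in-memory SQLite database (the standard sqlite3 module) as a stand-in for MySQL, the table name demo is made up, and it assumes a recent pandas release in which 'fail' raises a ValueError.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database used as a stand-in for MySQL (assumption).
con = sqlite3.connect(":memory:")
df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})

# First write: the table does not exist yet, so it is created.
df.to_sql("demo", con, index=False)

# 'fail' (the default): the table exists, so recent pandas raises ValueError.
try:
    df.to_sql("demo", con, index=False, if_exists="fail")
    raised = False
except ValueError:
    raised = True

# 'append': rows are added to the existing table.
df.to_sql("demo", con, index=False, if_exists="append")
rows_after_append = con.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

# 'replace': the table is dropped, recreated, and filled again.
df.to_sql("demo", con, index=False, if_exists="replace")
rows_after_replace = con.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

print(raised, rows_after_append, rows_after_replace)
```

With a two-row frame, the append step should leave four rows in the table and the replace step should bring it back to two.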
Although the write_frame docs currently suggest it only works on SQLite, MySQL appears to be supported, and in fact there is quite a bit of MySQL testing in the codebase.
Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but should also work for Python 2.7 (and Python 3.x):
First, let's create the dataframe:
# Create dataframe
import pandas as pd
import numpy as np

np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class': np.random.binomial(2, 0.1, size=number_of_samples),
}, columns=['feature1', 'feature2', 'class'])
print(frame)
Which gives:
   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0
To import this dataframe into a MySQL table:
# Import dataframe into MySQL
import sqlalchemy

database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip = 'ENTER DATABASE IP'
database_name = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine(
    'mysql+mysqlconnector://{0}:{1}@{2}/{3}'.format(
        database_username, database_password, database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')
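One thing to watch with a URL built by string formatting: if the username or password contains characters such as @, / or :, they need to be percent-encoded or the URL will be parsed incorrectly. A minimal sketch of that, using only the standard library; the helper name build_mysql_url and the sample credentials are made up for illustration:

```python
from urllib.parse import quote_plus


def build_mysql_url(username, password, host, database, driver="mysqlconnector"):
    """Return a SQLAlchemy-style MySQL URL with the credentials percent-encoded.

    build_mysql_url is a hypothetical helper, not part of pandas or SQLAlchemy.
    """
    return "mysql+{driver}://{user}:{pw}@{host}/{db}".format(
        driver=driver,
        user=quote_plus(username),
        pw=quote_plus(password),
        host=host,
        db=database,
    )


# Sample values only; the password deliberately contains '@', '/' and ':'.
url = build_mysql_url("root", "p@ss/w:rd", "127.0.0.1", "data_2")
print(url)
```

The resulting string can be passed to sqlalchemy.create_engine in place of the formatted URL above.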
One catch is that MySQLdb doesn't work with Python 3.x. So instead we use mysqlconnector, which may be installed as follows:
pip install mysql-connector==2.1.4 # version avoids Protobuf error
Note that to_sql creates the table as well as the columns if they do not already exist in the database.
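To see what to_sql creates, you can inspect the table afterwards. The sketch below is an illustration under assumptions rather than the MySQL setup from this answer: it writes a small two-row frame to an in-memory SQLite database and lists the resulting columns with SQLite's PRAGMA table_info (on MySQL you would use something like DESCRIBE instead).

```python
import sqlite3

import pandas as pd

# A shortened version of the frame above (values are illustrative).
frame = pd.DataFrame(
    {"feature1": [0.5, 0.7], "feature2": [0.8, 0.5], "class": [1, 0]}
)

# In-memory SQLite database used as a stand-in for MySQL (assumption).
con = sqlite3.connect(":memory:")

# The table does not exist yet; to_sql creates it together with its columns.
frame.to_sql("table_name_for_df", con, if_exists="replace")

# Each row of PRAGMA table_info is (cid, name, type, notnull, default, pk).
columns = [row[1] for row in con.execute("PRAGMA table_info('table_name_for_df')")]
row_count = con.execute("SELECT COUNT(*) FROM table_name_for_df").fetchone()[0]
print(columns, row_count)
```

Because index=False was not passed, the frame's index is written too, as an extra column named index.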
You can do it by using pymysql:
For example, suppose you have a MySQL server with the following user, password, host and port, and you want to write to the database 'data_2', whether or not it already exists.
import pymysql

user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host = '172.17.0.2'
port = 3306
database = 'data_2'
If you already have the database created:
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw, db=database, charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')
If the database has NOT been created yet (this also works when it already exists):
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)
conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw, db=database, charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')
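Recent pandas releases no longer accept the flavor argument, and for MySQL they expect a SQLAlchemy engine rather than a raw pymysql connection. A rough sketch of the same two steps under those assumptions follows. The names quote_identifier and write_frame_to_mysql are made up for this example, URL.create assumes SQLAlchemy 1.4 or newer, and the code has not been run against a real server here.

```python
def quote_identifier(name):
    """Quote a MySQL identifier: wrap it in backticks, doubling any backtick inside."""
    return "`" + name.replace("`", "``") + "`"


def write_frame_to_mysql(df, table, user, password, host, port, database):
    """Create `database` if needed, then write `df` to `table` (hypothetical helper)."""
    # Imported lazily so quote_identifier can be used without these packages.
    import pymysql
    from sqlalchemy import create_engine
    from sqlalchemy.engine import URL

    # Step 1: connect without selecting a database and create it if missing.
    conn = pymysql.connect(host=host, port=port, user=user, password=password)
    try:
        with conn.cursor() as cursor:
            cursor.execute("CREATE DATABASE IF NOT EXISTS " + quote_identifier(database))
    finally:
        conn.close()

    # Step 2: build an engine for that database and hand it to pandas.
    url = URL.create(
        "mysql+pymysql",
        username=user,
        password=password,
        host=host,
        port=port,
        database=database,
        query={"charset": "utf8"},
    )
    engine = create_engine(url)
    df.to_sql(name=table, con=engine, if_exists="replace", index=False)


print(quote_identifier("data_2"))
```

With the variables defined earlier, a call would look like write_frame_to_mysql(data, 'my_table', user, passw, host, port, database), where 'my_table' is a placeholder table name. Quoting the database name avoids building the CREATE DATABASE statement from an unescaped string.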