Importing a CSV file into a sqlite3 database table using Python


import csv, sqlite3

con = sqlite3.connect(":memory:")  # use a file path like "your_filename.db" for an on-disk database
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);")  # use your column names here

with open('data.csv', 'r') as fin:
    # csv.DictReader uses the first line in the file for the column headings by default
    dr = csv.DictReader(fin)  # comma is the default delimiter
    to_db = [(i['col1'], i['col2']) for i in dr]

cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()
con.close()
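To see the whole pattern run end to end, here is a self-contained sketch of the same approach: the sample data and table name are made up, and a StringIO stands in for the CSV file since csv.DictReader accepts any iterable of lines.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for data.csv
fin = io.StringIO("col1,col2\na,1\nb,2\n")
dr = csv.DictReader(fin)            # first line becomes the column headings
to_db = [(row['col1'], row['col2']) for row in dr]

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE t (col1, col2);")
cur.executemany("INSERT INTO t (col1, col2) VALUES (?, ?);", to_db)
con.commit()

count = cur.execute("SELECT count(*) FROM t").fetchone()[0]
print(count)  # → 2
con.close()
```

Swapping the StringIO for `open('data.csv')` gives the answer's original form.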


Creating a sqlite connection to a file on disk is left as an exercise for the reader ... but there is now a two-liner made possible by the pandas library:

df = pandas.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)
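Filling in the "exercise" around those two lines, a runnable sketch might look like the following; the sample data and the table name `t` are assumptions, and the in-memory connection can be swapped for a file path.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample data; in practice pass your CSV's path to read_csv
csvfile = io.StringIO("col1,col2\na,1\nb,2\n")
table_name = "t"

conn = sqlite3.connect(":memory:")  # use e.g. "your_filename.db" for a file on disk
df = pd.read_csv(csvfile)
df.to_sql(table_name, conn, if_exists='append', index=False)

n = conn.execute("SELECT count(*) FROM t").fetchone()[0]
print(n)  # → 2
conn.close()
```

`if_exists='append'` creates the table if it does not exist and appends otherwise; `index=False` keeps pandas from writing its row index as an extra column.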


You're right that .import is the way to go, but that's a command from the SQLite3 command-line program. Many of the top answers to this question involve native Python loops, but if your files are large (mine are 10^6 to 10^7 records), you want to avoid reading everything into pandas or using a native Python list comprehension/loop (though I did not time them for comparison).

For large files, I believe the best option is to use subprocess.run() to execute sqlite's import command. In the example below, I assume the table already exists, but the csv file has headers in the first row. See .import docs for more info.


import subprocess
from pathlib import Path

db_name = Path('my.db').resolve()
csv_file = Path('file.csv').resolve()

result = subprocess.run(['sqlite3',
                         str(db_name),
                         '-cmd',
                         '.mode csv',
                         '.import --skip 1 ' + str(csv_file).replace('\\', '\\\\')
                                 + ' <table_name>'],
                        capture_output=True)

edit note: sqlite3's .import command has improved so that it can treat the first row as header names or even skip the first x rows (requires version >= 3.32, as noted in this answer). If you have an older version of sqlite3, you may need to first create the table, then strip off the first row of the csv before importing. The --skip 1 argument will give an error prior to 3.32.
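One way to check whether you are past that threshold, as a sketch: Python's sqlite3 module reports the version of the SQLite library it links against, which is often, but not always, the same build as the standalone sqlite3 binary, so check the binary separately with `sqlite3 --version` before relying on `.import --skip`.

```python
import sqlite3

# Version of the SQLite library linked into Python's sqlite3 module, as a
# tuple, e.g. (3, 39, 4). The standalone `sqlite3` command-line tool can be
# a different build, so this is only a first approximation.
supports_skip = sqlite3.sqlite_version_info >= (3, 32, 0)
print(sqlite3.sqlite_version, supports_skip)
```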

Explanation
From the command line, the command you're looking for is sqlite3 my.db -cmd ".mode csv" ".import file.csv table". subprocess.run() runs a command-line process. The argument to subprocess.run() is a sequence of strings, which are interpreted as a command followed by all of its arguments.

  • sqlite3 my.db opens the database
  • The -cmd flag after the database lets you pass multiple follow-on commands to the sqlite program. In the shell, each command has to be in quotes, but here they just need to be their own element of the sequence
  • '.mode csv' does what you'd expect
  • '.import --skip 1'+str(csv_file).replace('\\','\\\\')+' <table_name>' is the import command.
    Unfortunately, since subprocess passes all follow-ons to -cmd as quoted strings, you need to double up your backslashes if you have a windows directory path.
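The backslash point in that last bullet, shown in isolation with a made-up Windows path:

```python
# What str(Path(...)) yields on Windows (the \\ in the literal is one
# backslash in the actual string):
p = 'C:\\data\\file.csv'

# Each single backslash must be doubled before it goes into the quoted
# .import string, hence the .replace('\\', '\\\\') in the command above
doubled = p.replace('\\', '\\\\')
print(doubled)  # → C:\\data\\file.csv
```

On POSIX paths the replace is a no-op, so it is safe to leave in unconditionally.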

Stripping Headers

Not really the main point of the question, but here's what I used. Again, I didn't want to read the whole files into memory at any point:

import shutil

with open(csv, "r") as source:
    source.readline()                      # discard the header row
    with open(str(csv) + "_nohead", "w") as target:
        shutil.copyfileobj(source, target)
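The same stripping logic as a small reusable function, sketched with a throwaway temp file for the demo; the `_nohead` suffix just follows the convention above, and shutil.copyfileobj streams in chunks, so the file is never fully in memory.

```python
import shutil
import tempfile
from pathlib import Path

def strip_header(csv_path):
    """Copy csv_path minus its first line to csv_path + '_nohead', streaming."""
    out_path = Path(str(csv_path) + "_nohead")
    with open(csv_path, "r") as source:
        source.readline()                  # discard the header row
        with open(out_path, "w") as target:
            shutil.copyfileobj(source, target)
    return out_path

# Demo on a throwaway file with hypothetical contents
tmp = Path(tempfile.mkdtemp()) / "file.csv"
tmp.write_text("col1,col2\na,1\nb,2\n")
out = strip_header(tmp)
print(out.read_text())  # → a,1\nb,2\n
```

The resulting `_nohead` file can then be fed to a plain `.import` (without --skip) on sqlite3 builds older than 3.32.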