peewee.OperationalError: too many SQL variables on upsert of only 150 rows * 8 columns peewee.OperationalError: too many SQL variables on upsert of only 150 rows * 8 columns sqlite sqlite

peewee.OperationalError: too many SQL variables on upsert of only 150 rows * 8 columns


After some investigation, the problem appears to be related with the maximum number of parameters that a sql query can have: SQLITE_MAX_VARIABLE_NUMBER.

To be able to do big bulk inserts I first estimate SQLITE_MAX_VARIABLE_NUMBER and then use it to create chunks in the list of dictionaries I want to insert.

To estimate the value I use this function inspired by this answer:

def max_sql_variables():    """Get the maximum number of arguments allowed in a query by the current    sqlite3 implementation. Based on `this question    `_    Returns    -------    int        inferred SQLITE_MAX_VARIABLE_NUMBER    """    import sqlite3    db = sqlite3.connect(':memory:')    cur = db.cursor()    cur.execute('CREATE TABLE t (test)')    low, high = 0, 100000    while (high - 1) > low:         guess = (high + low) // 2        query = 'INSERT INTO t VALUES ' + ','.join(['(?)' for _ in                                                    range(guess)])        args = [str(i) for i in range(guess)]        try:            cur.execute(query, args)        except sqlite3.OperationalError as e:            if "too many SQL variables" in str(e):                high = guess            else:                raise        else:            low = guess    cur.close()    db.close()    return lowSQLITE_MAX_VARIABLE_NUMBER = max_sql_variables()

Then I use the above variable to slice the data

with database.atomic() as txn:    size = (SQLITE_MAX_VARIABLE_NUMBER // len(data[0])) -1    # remove one to avoid issue if peewee adds some variable    for i in range(0, len(data), size):        table.insert_many(data[i:i+size]).upsert().execute()

An update about execution speed of max_sql_variables.

On a 3 years old Intel machine with 4 cores and 4 Gb of RAM, running OpenSUSE tumbleweed, with SQLITE_MAX_VARIABLE_NUMBER set to 999, the function runs in less that 100ms. If I set high = 1000000, the execution time becomes of the order of 300ms.

On a younger Intel machine with 8 cores and 8Gb of RAM, running Kubuntu, with SQLITE_MAX_VARIABLE_NUMBER set to 250000, the function runs in about 2.6 seconds and returns 99999. If I set high = 1000000, the execution time becomes of the order of 4.5 seconds.


Looking here, https://www.sqlite.org/limits.html#max_column it seems the limit should be 2000:

The SQLITE_MAX_COLUMN compile-time parameter is used to set an upper bound on:

  • ... snip ...
  • The number of values in an INSERT statement

I guess you're bumping against the limit somehow? At any rate, just chunk your input or re-compile SQLite with higher limits.