Why close a cursor for Sqlite3 in Python
Analysis
CPython _sqlite3.Cursor.close
corresponds to pysqlite_cursor_close
which besides a few sanity checks and marking it as closed, does this:
if (self->statement) { (void)pysqlite_statement_reset(self->statement); Py_CLEAR(self->statement);}
pysqlite_statement_reset
in turn calls sqlite3_reset
from SQLite's C API:
The sqlite3_reset() function is called to reset a prepared statement object back to its initial state, ready to be re-executed. Any SQL statement variables that had values bound to them using the sqlite3_bind_*() API retain their values. Use sqlite3_clear_bindings() to reset the bindings.
[...]
The sqlite3_reset(S) interface does not change the values of any bindings on the prepared statement S.
Prepared Statement Object API is used to bind parameters e.g. in _sqlite3.Cursor.execute
. So if sqlite3_clear_bindings
was used, it might have been able to free up some memory used to store the parameters, but I don't see it called anywhere in CPython/pysqlite.
Experiment
I use memory-profiler to draw a memory usage chart and produce line-by-line reports.
import loggingimport sqlite3import time# For the function brackets to appear on the chart leave this out:## If your Python file imports the memory profiler # "from memory_profiler import profile" these timestamps will not be# recorded. Comment out the import, leave your functions decorated, # and re-run.## from memory_profiler import profileclass CursorCuriosity: cursor_num = 20_000 param_num = 200 def __init__(self): self.conn = sqlite3.connect(':memory:') self.cursors = [] @profile def create(self): logging.info('Creating cursors') sql = 'SELECT {}'.format(','.join(['?'] * self.param_num)) for i in range(self.cursor_num): params = [i] * self.param_num cur = self.conn.execute(sql, params) self.cursors.append(cur) @profile def close(self): logging.info('Closing cursors') for cur in self.cursors: cur.close() @profile def delete(self): logging.info('Destructing cursors') self.cursors.clear() @profile def disconnect(self): logging.info('Disconnecting') self.conn.close() del self.conn@profiledef main(): curcur = CursorCuriosity() logging.info('Sleeping before calling create()') time.sleep(2) curcur.create() logging.info('Sleeping before calling close()') time.sleep(2) curcur.close() logging.info('Sleeping before calling delete()') time.sleep(2) curcur.delete() logging.info('Sleeping before calling disconnect()') time.sleep(2) curcur.disconnect() logging.info('Sleeping before exit') time.sleep(2) if __name__ == '__main__': logging.basicConfig(level='INFO', format='%(asctime)s %(message)s') main()
I run it first with the profile
import commented out to get the plot.
mprof run -T 0.05 cursor_overhead.pymprof plot
Then with the import to get output in the terminal.
mprof run -T 0.05 cursor_overhead.py
Line # Mem usage Increment Occurences Line Contents============================================================ 51 19.1 MiB 19.1 MiB 1 @profile 52 def main(): 53 19.1 MiB 0.0 MiB 1 curcur = CursorCuriosity() 54 55 19.1 MiB 0.0 MiB 1 logging.info('Sleeping before calling create()') 56 19.1 MiB 0.0 MiB 1 time.sleep(2) 57 2410.3 MiB 2391.2 MiB 1 curcur.create() 58 59 2410.3 MiB 0.0 MiB 1 logging.info('Sleeping before calling close()') 60 2410.3 MiB 0.0 MiB 1 time.sleep(2) 61 2410.3 MiB 0.0 MiB 1 curcur.close() 62 63 2410.3 MiB 0.0 MiB 1 logging.info('Sleeping before calling delete()') 64 2410.3 MiB 0.0 MiB 1 time.sleep(2) 65 1972.2 MiB -438.1 MiB 1 curcur.delete() 66 67 1972.2 MiB 0.0 MiB 1 logging.info('Sleeping before calling disconnect()') 68 1972.2 MiB 0.0 MiB 1 time.sleep(2) 69 1872.7 MiB -99.5 MiB 1 curcur.disconnect() 70 71 1872.7 MiB 0.0 MiB 1 logging.info('Sleeping before exit') 72 1872.7 MiB 0.0 MiB 1 time.sleep(2)
And individual method for completeness.
Line # Mem usage Increment Occurences Line Contents============================================================ 24 19.1 MiB 19.1 MiB 1 @profile 25 def create(self): 26 19.1 MiB 0.0 MiB 1 logging.info('Creating cursors') 27 19.1 MiB 0.0 MiB 1 sql = 'SELECT {}'.format(','.join(['?'] * self.param_num)) 28 2410.3 MiB 0.0 MiB 20001 for i in range(self.cursor_num): 29 2410.1 MiB 0.0 MiB 20000 params = [i] * self.param_num 30 2410.3 MiB 2374.3 MiB 20000 cur = self.conn.execute(sql, params) 31 2410.3 MiB 16.9 MiB 20000 self.cursors.append(cur)
Line # Mem usage Increment Occurences Line Contents============================================================ 33 2410.3 MiB 2410.3 MiB 1 @profile 34 def close(self): 35 2410.3 MiB 0.0 MiB 1 logging.info('Closing cursors') 36 2410.3 MiB 0.0 MiB 20001 for cur in self.cursors: 37 2410.3 MiB 0.0 MiB 20000 cur.close()
Line # Mem usage Increment Occurences Line Contents============================================================ 39 2410.3 MiB 2410.3 MiB 1 @profile 40 def delete(self): 41 2410.3 MiB 0.0 MiB 1 logging.info('Destructing cursors') 42 1972.2 MiB -438.1 MiB 1 self.cursors.clear()
Line # Mem usage Increment Occurences Line Contents============================================================ 44 1972.2 MiB 1972.2 MiB 1 @profile 45 def disconnect(self): 46 1972.2 MiB 0.0 MiB 1 logging.info('Disconnecting') 47 1972.2 MiB 0.0 MiB 1 self.conn.close() 48 1872.7 MiB -99.5 MiB 1 del self.conn
Conclusion
- Closing a
sqlite3.Cursor
doesn't free up memory (but does a little but of work, manipulating the SQLite prepared statement's state) - Deleting/destructing a cursor frees up memory
- Deleting/destructing a
sqlite3.Connection
frees up memory (closing doesn't)
In the case of SQLite, there isn't much difference but the API for databases is not just for embedded databases but for all SQL databases.
For a DBMS, a cursor often implies a session in the client and sometimes on the server.
So, if you are not using a reference counting implementation of Python (such as CPython), then a lot of resources could be tied up until GC frees them.