*large* Python dictionary with persistent storage for quick look-ups
If you want to persist a large dictionary, you are basically looking at a database.
Python comes with built-in support for sqlite3, which gives you an easy database solution backed by a file on disk.
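As a rough sketch of what that could look like, the snippet below treats a single sqlite3 table as a string-to-string dictionary. The table name `kv`, the file name, and the `put`/`get` helpers are made up for illustration and are not part of the sqlite3 API.

```python
import os
import sqlite3
import tempfile

# Hypothetical location; any writable path on disk would do.
db_path = os.path.join(tempfile.mkdtemp(), "lookup.sqlite3")

conn = sqlite3.connect(db_path)
# The PRIMARY KEY gives an index on key, so look-ups do not scan the table.
conn.execute("CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")


def put(key, value):
    """Store or overwrite a value (illustrative helper)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)",
            (key, value),
        )


def get(key, default=None):
    """Return the stored value, or default if the key is absent."""
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row is not None else default


put("www.python.org", "Python Website")
put("www.python.org", "Python Home")  # overwrites, like dict assignment

found = get("www.python.org")
missing = get("www.example.invalid", "not present")
print(found, missing)
```

Only the rows you query are read from disk, so the full data set never has to fit in memory. For bulk loading you would probably want to batch many inserts into one transaction rather than commit per key as this sketch does.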
In principle, the shelve module does exactly what you want. It provides a persistent dictionary backed by a database file. Keys must be strings, but shelve takes care of pickling and unpickling values. The type of database file can vary, but it can be a Berkeley DB hash, which is an excellent lightweight key-value database.
Your data size sounds huge, so you should do some testing, but shelve/BDB is probably up to it.
Note: The bsddb module was deprecated in Python 2.6 and removed from the standard library in Python 3, so shelve there uses whichever dbm backend is available rather than BDB hashes.
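For reference, a minimal shelve sketch is shown here. The file name and the sample records are invented for the example, and which backend ends up on disk depends on what your Python build provides.

```python
import os
import shelve
import tempfile

# Hypothetical path; the dbm backend may add its own file extension(s).
shelf_path = os.path.join(tempfile.mkdtemp(), "lookup_shelf")

# Write: keys are strings, values can be any picklable object.
with shelve.open(shelf_path) as shelf:
    shelf["alice"] = {"id": 1, "tags": ["admin", "dev"]}
    shelf["bob"] = {"id": 2, "tags": []}

# Read back: reopen the same file and use it like a dict.
with shelve.open(shelf_path) as shelf:
    alice = shelf["alice"]        # unpickles just this one value
    has_carol = "carol" in shelf  # membership test without loading values
    count = len(shelf)

print(alice, has_carol, count)
```

One thing to watch for: changing a retrieved value in place (for example `shelf["alice"]["tags"].append("x")`) is not written back unless you reassign the key or open the shelf with `writeback=True`.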
No one has mentioned dbm. It is opened like a file, behaves like a dictionary and is in the standard distribution.
From the docs (https://docs.python.org/3/library/dbm.html):
    import dbm

    # Open database, creating it if necessary.
    with dbm.open('cache', 'c') as db:

        # Record some values
        db[b'hello'] = b'there'
        db['www.python.org'] = 'Python Website'
        db['www.cnn.com'] = 'Cable News Network'

        # Note that the keys are considered bytes now.
        assert db[b'www.python.org'] == b'Python Website'
        # Notice how the value is now in bytes.
        assert db['www.cnn.com'] == b'Cable News Network'

        # Often-used methods of the dict interface work too.
        print(db.get('python.org', b'not present'))

        # Storing a non-string key or value will raise an exception (most
        # likely a TypeError).
        db['www.yahoo.com'] = 4

    # db is automatically closed when leaving the with statement.
I would try this before any of the more exotic options. Note that unpickling a whole dictionary pulls everything into memory on loading; shelve, which is itself built on dbm, only unpickles the values you actually access.
Cheers
Tim