Get all keys in a Redis database with Python
Use scan_iter()
scan_iter() is superior to keys() for large numbers of keys because it gives you an iterator you can use rather than trying to load all the keys into memory.
I had 1B records in my Redis instance and could never get enough memory to return all the keys at once.
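To make the memory point concrete, here is a minimal sketch that counts matching keys without ever holding them all in a list. The function name count_keys and the count_hint parameter are my own, not part of redis-py; client is assumed to be a redis-py client such as redis.Redis().

```python
def count_keys(client, pattern="*", count_hint=1000):
    """Count keys matching `pattern` without building a list of them.

    `client` is assumed to be a redis-py client (for example redis.Redis()).
    `count_hint` is passed as SCAN's COUNT option, which is only a hint for
    how much work the server does per call.
    """
    total = 0
    # scan_iter() yields keys lazily, one SCAN page at a time
    for _key in client.scan_iter(match=pattern, count=count_hint):
        total += 1
    return total


# Usage (assumes a Redis server on localhost:6379):
#   import redis
#   r = redis.Redis(host="localhost", port=6379, db=0)
#   print(count_keys(r, "user:*"))
```

Only one key is referenced at a time on the Python side, so memory use stays flat however many keys match.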
SCANNING KEYS ONE-BY-ONE
Here is a Python snippet using scan_iter() to get all keys from the store matching a pattern and delete them one-by-one:
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)

for key in r.scan_iter("user:*"):
    # delete the key
    r.delete(key)
SCANNING IN BATCHES
If you have a very large list of keys to scan (for example, more than 100k keys), it will be more efficient to scan them in batches, like this:
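import redis
from itertools import zip_longest  # named izip_longest on Python 2

r = redis.StrictRedis(host='localhost', port=6379, db=0)

# iterate an iterable in batches of size n
# (the final batch is padded with None up to length n)
def batcher(iterable, n):
    args = [iter(iterable)] * n
    return zip_longest(*args)

# in batches of 500 delete keys matching user:*
for keybatch in batcher(r.scan_iter('user:*'), 500):
    # drop the None padding from the final batch before deleting
    r.delete(*[key for key in keybatch if key is not None])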
I benchmarked this script and found that using a batch size of 500 was 5 times faster than scanning keys one-by-one. I tested different batch sizes (3, 50, 500, 1000, 5000) and found that a batch size of 500 seems to be optimal.
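A variant of the batching approach, sketched with itertools.islice so that the final batch is simply shorter rather than padded. The helper names chunked and delete_matching are my own, and client is assumed to be a redis-py client; the default of 500 just mirrors the batch size from my benchmark.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of up to `size` items; the last list may be shorter."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def delete_matching(client, pattern, batch_size=500):
    """Delete keys matching `pattern` in batches; return the number deleted.

    `client` is assumed to be a redis-py client such as redis.Redis().
    """
    deleted = 0
    for batch in chunked(client.scan_iter(match=pattern), batch_size):
        # DEL accepts several keys and returns how many were removed
        deleted += client.delete(*batch)
    return deleted


# Usage (assumes a Redis server on localhost:6379):
#   import redis
#   r = redis.Redis(host="localhost", port=6379, db=0)
#   delete_matching(r, "user:*", batch_size=500)
```

Each batch costs one round trip to the server instead of one per key, which is where the speed-up comes from.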
Note that whether you use the scan_iter() or keys() method, the operation is not atomic and could fail part way through.
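Because the operation is not atomic, it can help to know how far a run got before it failed. The sketch below is one hypothetical way to do that; the function name is my own, client is assumed to be a redis-py client, and the broad except is only for illustration (with redis-py you would more likely catch redis.exceptions.RedisError).

```python
def delete_matching_with_progress(client, pattern):
    """Delete keys one by one; return (deleted_count, error_or_None).

    `client` is assumed to be a redis-py client such as redis.Redis().
    """
    deleted = 0
    try:
        for key in client.scan_iter(match=pattern):
            deleted += client.delete(key)
    except Exception as exc:  # e.g. a connection error part way through
        # keys removed before the failure stay removed; nothing is rolled back
        return deleted, exc
    return deleted, None
```

Rerunning after a failure is safe, since DEL ignores keys that no longer exist and the second run only picks up what is left.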
DEFINITELY AVOID USING XARGS ON THE COMMAND-LINE
I do not recommend this example, which I found repeated elsewhere. It will fail for Unicode keys and is incredibly slow for even moderate numbers of keys:
redis-cli --raw keys "user:*"| xargs redis-cli del
In this example xargs creates a new redis-cli process for every key! That's bad.
I benchmarked this approach to be 4 times slower than the first Python example, where it deleted every key one-by-one, and 20 times slower than deleting in batches of 500.
import redis

r = redis.Redis("localhost", 6379)

for key in r.scan_iter():
    print(key)

using the redis-py library
The SCAN command is available since Redis 2.8.0.
Time complexity: O(1) for every call. O(N) for a complete iteration, including enough command calls for the cursor to return back to 0. N is the number of elements inside the collection.
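To show what "the cursor returns back to 0" means, here is a rough sketch that drives the SCAN cursor by hand instead of using scan_iter(). The name scan_all and the count_hint parameter are my own; client is assumed to be a redis-py client, whose scan() returns a (next_cursor, keys) pair. It collects everything into a set purely for illustration, so don't do that with a huge keyspace.

```python
def scan_all(client, pattern="*", count_hint=100):
    """Collect matching keys by driving the SCAN cursor by hand."""
    keys = set()  # SCAN may return the same key more than once
    cursor = 0
    while True:
        # each call is one SCAN command; it returns (next_cursor, page_of_keys)
        cursor, page = client.scan(cursor=cursor, match=pattern, count=count_hint)
        keys.update(page)
        if cursor == 0:  # a cursor of 0 means the iteration is complete
            return keys


# Usage (assumes a Redis server on localhost:6379):
#   import redis
#   r = redis.Redis("localhost", 6379)
#   print(len(scan_all(r, "user:*")))
```

Each loop pass is one of the O(1) calls, and the whole loop is the O(N) complete iteration. A page can come back empty while the cursor is still non-zero, which is why the loop checks the cursor and not the page.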