Redis Queue + python-rq: Right pattern to prevent high memory usage?

After two more days of playing around, I have found the problem. I would like to share this with you, along with the tools that were helpful:

Core Problem

The actual problem was that we had forgotten to cast an object to a string before saving it to the PostgreSQL database. Without this cast, the string representation still ended up in the DB (because the __str__() method of the respective object returns exactly the representation we wanted); to Redis, however, the whole object was passed. There, the associated task crashed with an UnpickleError exception, and each crash consumed about 5 MB of RAM that was not freed after the crash.
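
To make the difference concrete, here is a minimal sketch (build_report, save_to_db, and the queue q are hypothetical stand-ins, not our actual code):

    # `report` is some rich object whose __str__() returns the text we want to store.
    report = build_report()

    # What we did: the DB layer stringified the object on the way into PostgreSQL,
    # so the table looked fine, but RQ pickled the entire object into Redis.
    q.enqueue(save_to_db, report)

    # What we should have done: cast explicitly, so only a small string is pickled.
    q.enqueue(save_to_db, str(report))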

Additional Actions

To reduce the memory footprint further, we implemented the following supplementary actions (note that we save everything to a separate DB, so the results that Redis stores are not used at all in our application):

  • We set the TTL of the task result to 0 with the call enqueue_call([...], result_ttl=0) (see the sketch after this list)
  • We defined a custom exception handler, black_hole, that swallows all exceptions and returns False. This prevents RQ from moving a task to the failed queue, where it would still use a bit of memory. Exceptions are emailed to us beforehand so we can keep track of them.
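
A minimal sketch wiring up both actions (my_task and send_error_mail are hypothetical placeholders; note that the keyword for registering handlers differs between RQ versions):

    from redis import Redis
    from rq import Queue, Worker

    conn = Redis()
    q = Queue(connection=conn)

    # Discard the return value immediately; we persist results in our own DB.
    q.enqueue_call(func=my_task, args=(some_arg,), result_ttl=0)

    def black_hole(job, *exc_info):
        send_error_mail(job, exc_info)  # our own helper that mails us the exception
        return False                    # swallow it; the job is not moved to the failed queue

    # Newer RQ versions accept exception_handlers=[...]; older ones use exc_handler=.
    worker = Worker([q], connection=conn, exception_handlers=[black_hole])
    worker.work()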

Useful tools along the way:

We just worked with redis-cli.

  • redis-cli info | grep used_memory_human --> shows current memory usage. Ideal for comparing the memory footprint before and after a task was executed.
  • redis-cli keys '*' --> lists all existing keys. This overview led me to the insight that some tasks were not deleted even though they should have been (as written above, they crashed with an UnpickleError and were therefore never removed).
  • redis-cli monitor --> shows a realtime overview of everything happening in Redis. This helped me discover that the objects being moved back and forth were far too large.
  • redis-cli debug object <key> --> shows a dump of the key's value.
  • redis-cli hgetall <key> --> shows a more readable dump of the key's value (especially useful for our use case of Redis purely as a task queue, since python-rq stores its jobs as hashes in this format).
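
The before/after memory comparison can also be scripted; here is a small sketch using redis-py (run_task_and_wait is a hypothetical placeholder for however you trigger and await a job):

    import redis

    r = redis.Redis()

    def used_memory():
        # Same figure that `redis-cli info` reports as used_memory
        return r.info()['used_memory']

    before = used_memory()
    run_task_and_wait()
    print('memory delta: %d bytes' % (used_memory() - before))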

Furthermore, I can answer some of the questions I had posted above:

From the docs I know that the 500 sec TTL means that a key is then "expired", but not really deleted. Does the key still consume memory at this point? Can I somehow change this behavior?

Actually, they are deleted, just as the docs imply.

Does it have something to do with the failed queue (which apparently does not have a TTL attached to the jobs, meaning (I think) that these are kept forever)?

Surprisingly, the jobs that crashed with the UnpickleError were not moved to the failed queue; they were just "abandoned", meaning the values remained in Redis, but RQ did not handle them the way it normally handles failed jobs.

If you are using the "Black Hole" exception handler from http://python-rq.org/docs/exceptions/, you should also add job.cancel() there:

def black_hole(job, *exc_info):
    # Delete the job hash on redis, otherwise it will stay on the queue forever
    job.cancel()
    return False


A thing that wasn't immediately obvious to me is that an RQ job has both 'description' and 'data' properties. If not specified, the description is set to a string representation of the data, which in my case was unnecessarily verbose. Explicitly setting the description to a short summary saved me that overhead:

enqueue(func, longdata, description='short job summary')
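
To see the difference, you can compare what ends up in job.description with and without the explicit argument (a quick sketch, with func and longdata as above):

    job = q.enqueue(func, longdata)
    print(job.description)   # default: a rendering of the call, including a repr of longdata

    job = q.enqueue(func, longdata, description='short job summary')
    print(job.description)   # 'short job summary'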