Python: Memory leak debugging


See http://opensourcehacker.com/2008/03/07/debugging-django-memory-leak-with-trackrefs-and-guppy/ . Short answer: if you're running Django outside the web request cycle (for example in a long-running script), you need to call db.reset_queries() manually (and of course have DEBUG=False, as others have mentioned). Django calls reset_queries() automatically as part of handling each web request, but in a standalone process that never happens.
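For a long-running script, the advice above could be sketched roughly as follows. The helper name `process_in_batches` and its parameters are hypothetical; in a real Django process you would pass `django.db.reset_queries` as `reset`. A plain stand-in for the query log is used here so the sketch runs without Django installed.

```python
def process_in_batches(items, handle, reset, every=1000):
    """Call handle(item) for each item and reset() every `every` items.

    Hypothetical helper: in a Django script, `reset` would be
    django.db.reset_queries, which clears the query log that Django
    otherwise only clears while handling web requests.
    """
    processed = 0
    for item in items:
        handle(item)
        processed += 1
        if processed % every == 0:
            reset()
    reset()  # clear whatever the last partial batch left behind
    return processed


# Stand-in for Django's per-connection query log (an assumption for the demo).
query_log = []

def fake_handle(item):
    query_log.append("SELECT ... WHERE id = %d" % item)

def fake_reset():
    del query_log[:]

count = process_in_batches(range(2500), fake_handle, fake_reset, every=1000)
```

The batch size is arbitrary; the point is only that the log is emptied at regular intervals instead of growing for the lifetime of the process.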


Is DEBUG=False in settings.py?

If not, Django will happily store every SQL query you make, which adds up.


Have you tried gc.set_debug()?

You need to ask yourself simple questions:

  • Am I using objects with __del__ methods? Do I absolutely, unequivocally, need them?
  • Can I get reference cycles in my code? Can I break these cycles before getting rid of the objects?

See, the main issue would be a cycle of objects containing __del__ methods:

    import gc

    class A(object):
        def __del__(self):
            print('a deleted')
            if hasattr(self, 'b'):
                delattr(self, 'b')

    class B(object):
        def __init__(self, a):
            self.a = a
        def __del__(self):
            print('b deleted')
            del self.a

    def createcycle():
        a = A()
        b = B(a)
        a.b = b
        return a, b

    gc.set_debug(gc.DEBUG_LEAK)

    a, b = createcycle()

    # remove references
    del a, b

    # prints (on Python versions before 3.4; since PEP 442 the collector
    # can finalize and free cycles like this one):
    ## gc: uncollectable <A 0x...>
    ## gc: uncollectable <B 0x...>
    ## gc: uncollectable <dict 0x...>
    ## gc: uncollectable <dict 0x...>
    gc.collect()

    # to solve this we explicitly break the cycle:
    a, b = createcycle()
    del a.b

    del a, b

    # objects are removed correctly (b goes first, and its __del__
    # releases the last reference to a):
    ## b deleted
    ## a deleted
    gc.collect()

I would really encourage you to identify the objects and concepts that form cycles in your application and focus on their lifetime: when you no longer need them, is anything still referencing them?
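One way to answer the "is anything still referencing it?" question at runtime is `gc.get_referrers()`. The sketch below is only illustrative: `Node`, `registry`, and `find_holders` are hypothetical names, and the result also includes incidental holders such as the module's globals dictionary.

```python
import gc

class Node(object):
    """Hypothetical class standing in for an application object."""
    pass

def find_holders(obj):
    """Return the objects that the collector knows directly refer to obj."""
    gc.collect()  # flush pending garbage so stale referrers do not show up
    return gc.get_referrers(obj)

registry = {}            # hypothetical long-lived cache
node = Node()
registry["current"] = node

holders = find_holders(node)
# Check whether the cache is among the objects keeping `node` alive.
registry_holds_node = any(h is registry for h in holders)
```

Once the offending container is identified, removing the entry (here `del registry["current"]`) is usually enough to let the object go.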

Even for cycles without __del__ methods, we can have an issue:

    import gc

    # class without destructor
    class A(object): pass

    def createcycle():
        # a -> b -> c
        # ^         |
        # ^<--<--<--|
        a = A()
        b = A()
        a.next = b
        c = A()
        b.next = c
        c.next = a
        return a, b, c

    gc.set_debug(gc.DEBUG_LEAK)

    a, b, c = createcycle()

    # since we have no __del__ methods, gc is able to collect the cycle:
    del a, b, c

    # no panic message, everything is collectable:
    ##gc: collectable <A 0x...>
    ##gc: collectable <A 0x...>
    ##gc: collectable <dict 0x...>
    ##gc: collectable <A 0x...>
    ##gc: collectable <dict 0x...>
    ##gc: collectable <dict 0x...>
    gc.collect()

    a, b, c = createcycle()

    # but as long as we keep an exterior ref to the cycle...:
    seen = dict()
    seen[a] = True

    # delete the cycle
    del a, b, c

    # nothing is collected
    gc.collect()

If you have to use "seen"-like dictionaries or a history, be careful to keep only the data you actually need, and no references to the objects themselves.
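Two possible ways to follow that advice are sketched below: a `weakref.WeakKeyDictionary`, whose keys do not keep objects alive, or recording only a plain value taken from the object. The `Node` class, `make_cycle`, and the `name` attribute are hypothetical, chosen just for the illustration.

```python
import gc
import weakref

class Node(object):
    """Hypothetical node type; plain instances support weak references."""
    pass

def make_cycle():
    a, b = Node(), Node()
    a.next, b.next = b, a
    return a, b

# Option 1: a weak-keyed "seen" map does not keep its keys alive.
seen = weakref.WeakKeyDictionary()
a, b = make_cycle()
seen[a] = True
probe = weakref.ref(a)      # lets us observe whether the object is still alive
del a, b
gc.collect()
cycle_freed = probe() is None and len(seen) == 0

# Option 2: record only the data needed (a name), not the object itself.
seen_names = set()
c, d = make_cycle()
c.name = "c"
seen_names.add(c.name)
probe2 = weakref.ref(c)
del c, d
gc.collect()
second_cycle_freed = probe2() is None
```

In both variants the bookkeeping survives only as long as it is useful, and the cycle is free to be collected once the last strong reference is dropped.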

I'm a bit disappointed by set_debug now: I wish it could be configured to send its output somewhere other than stderr. Hopefully that will change soon.
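In the meantime, one possible workaround is to skip the stderr messages and inspect `gc.garbage` directly: with `gc.DEBUG_SAVEALL` set, the collector appends every unreachable object to that list instead of freeing it. The sketch below assumes that approach; `report_unreachable`, `Node`, and `make_cycle` are hypothetical names.

```python
import gc

class Node(object):
    """Hypothetical class used to build a throwaway cycle."""
    pass

def report_unreachable(make_garbage):
    """Run make_garbage(), collect, and return the type names of what was found."""
    gc.collect()                        # clear out unrelated garbage first
    old_flags = gc.get_debug()
    gc.set_debug(gc.DEBUG_SAVEALL)      # save unreachable objects in gc.garbage
    try:
        make_garbage()
        gc.collect()
        report = [type(obj).__name__ for obj in gc.garbage]
    finally:
        gc.set_debug(old_flags)         # restore the previous debug flags
        del gc.garbage[:]               # release the saved objects
    return report

def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a             # two-object reference cycle

report = report_unreachable(make_cycle)
```

The returned list is ordinary Python data, so it can be written to a log file or sent wherever is convenient. Depending on the interpreter version it may also contain the instances' `dict` objects.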