Tracking Down Memory Leaks in Python
This Page Mostly Obsolete
Most of the material on this page is obsolete nowadays, because Python (since version 2.0, I think) includes a cyclic-reference garbage collector. Leaks like this are still possible, but usually only when an older Python extension module (one implementing a container type) is used that doesn't adhere to the new GC API.
Long-running processes have a nasty habit of exposing Python's Achilles' heel: memory leaks created by cycles (objects that point at each other, either directly or circuitously). Reference counting cannot collect cycles. Here's one way to create a cycle:

    class thing:
        pass

                     refcount(a)  refcount(b)
    a = thing()          1
    b = thing()          1            1
    a.other = b          1            2
    b.other = a          2            2
    del a                1            2
    del b                1            1

Objects a and b have become immortal: no name refers to them any more, yet each keeps the other's reference count at 1.
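On an interpreter that has the cycle collector mentioned at the top of this page, the same cycle can be watched and then reclaimed. The following is a minimal sketch, assuming a current CPython with the standard gc and weakref modules. The weak reference named probe is a hypothetical helper added only to check whether the object still exists; it is not part of the original example.

```python
import gc
import weakref

class thing:
    pass

gc.disable()                 # keep the automatic collector out of the way for the demo

a = thing()
b = thing()
a.other = b
b.other = a

probe = weakref.ref(a)       # hypothetical helper: observes the object without keeping it alive

del a
del b
# Reference counting alone has not freed the pair: each still holds the other.
alive_after_del = probe() is not None

unreachable = gc.collect()   # run the cycle collector explicitly
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect, unreachable)
```

With the collector disabled, the object should still be reported alive after both del statements, and gone once gc.collect() has run.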
Large and complex systems may create non-obvious cycles. Here are a few quick hints to avoid various ones that I've run into:
- Be careful with bound method objects. Bound methods are created whenever you refer to a method object through an instance. This happens safely every time you call a method, but a common programming style ('functional') passes such objects around, stores them in variables, and so on. In one case, storing a bound method in the object made it immortal. Either del this object manually, or change your code.
- Tracebacks are very dangerous. To be really safe, del a traceback if you have a handle to it. Another good idea is to assign None to both sys.traceback and sys.exc_traceback.
  Tracebacks can capture a large number of objects, especially within Medusa. For example, any exception handler that is called from within the polling loop (no matter how deeply) should do something like this:

    def my_method (self):
        try:
            do_something()
        except:
            try:
                ei = sys.exc_info()
                [... report error ...]
            finally:
                del ei

  ...otherwise it will capture references to every object in the socket map. I have plugged some really bad leaks this way.
- Keep track of your objects. Either keep a count of how many are around (in a __del__ method), or keep a dictionary of the addresses of all outstanding objects. This will help locate leaks.

    class thing:
        all_things = {}
        def __init__ (self):
            thing.all_things[id(self)] = 1
        def __del__ (self):
            del thing.all_things[id(self)]

  Here is a module that will let you resurrect leaked objects. Using this module should fill you with shame. Make sure no one is looking.

    for addr in thing.all_things.keys():
        r = resurrect.conjure (addr)
        # examine r...
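
On current Python versions, a similar registry can be kept with weak references, so the outstanding objects can be looked up directly instead of being conjured from an address. This is a minimal sketch, assuming the standard weakref module; the names tracked_thing and outstanding() are hypothetical and exist only for illustration.

```python
import gc
import weakref

class tracked_thing:
    # Maps id(obj) -> obj without keeping obj alive.
    # An entry disappears on its own once the object is freed.
    all_things = weakref.WeakValueDictionary()

    def __init__(self):
        tracked_thing.all_things[id(self)] = self

def outstanding():
    """Return the tracked objects that are still alive right now."""
    return list(tracked_thing.all_things.values())

gc.disable()                # rely on reference counting only, as in the text

x = tracked_thing()
y = tracked_thing()
y.me = y                    # self-cycle: reference counting alone cannot free y

del x                       # freed at once, so its registry entry vanishes
del y                       # kept alive by the cycle
leaked = outstanding()      # the leaked object is available here for inspection
count_leaked = len(leaked)

del leaked
gc.enable()
gc.collect()                # the cycle collector reclaims the leaked object
count_after_collect = len(outstanding())

print(count_leaked, count_after_collect)
```

Unlike the dictionary of addresses, this registry needs no __del__ method, which also avoids the complications that __del__ methods historically caused for objects caught in cycles.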
- Here's the easiest way to find leaking objects: examine the reference count of each of your class objects. This will be roughly equal to the number of extant instance objects.

    # -*- Mode: Python; tab-width: 4 -*-
    # Written for Python 2: uses types.ClassType and the print statement.
    import sys
    import types

    def get_refcounts():
        d = {}
        # collect all classes
        for m in sys.modules.values():
            for sym in dir(m):
                o = getattr (m, sym)
                if type(o) is types.ClassType:
                    d[o] = sys.getrefcount (o)
        # sort by refcount
        pairs = map (lambda x: (x[1],x[0]), d.items())
        pairs.sort()
        pairs.reverse()
        return pairs

    def print_top_100():
        for n, c in get_refcounts()[:100]:
            print '%10d %s' % (n, c.__name__)

    if __name__ == '__main__':
        print_top_100()

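
The listing above targets Python 2; types.ClassType and the print statement no longer exist in Python 3. A rough Python 3 adaptation is sketched below. It is a port made under assumptions, not the author's code: it treats any instance of type as a class, skips attributes that raise on access, and sorts on the count alone. The names class_refcounts and print_top are hypothetical.

```python
import sys

def class_refcounts():
    """Return (refcount, class) pairs for classes visible in loaded modules, highest first."""
    classes = {}
    # Copy the values first: an import triggered during the scan would change sys.modules.
    for module in list(sys.modules.values()):
        try:
            names = dir(module)
        except Exception:
            continue
        for name in names:
            try:
                obj = getattr(module, name)
                if isinstance(obj, type):
                    classes[id(obj)] = obj   # key by id so each class is counted once
            except Exception:
                continue                     # some modules raise on lazy attributes
    pairs = [(sys.getrefcount(c), c) for c in classes.values()]
    # Sort on the count only, because classes themselves are not orderable.
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs

def print_top(limit=100):
    for n, c in class_refcounts()[:limit]:
        print('%10d %s' % (n, c.__name__))

if __name__ == '__main__':
    print_top(10)
```

On recent CPython releases, built-in types may report very large, fixed counts, so the figures are most meaningful for classes defined in your own modules.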
Notes
An interface to malloc_stats() [Linux]
Original: http://www.nightmare.com/medusa/memory-leaks.html