Memory usage keeps growing with Python's multiprocessing.pool
I ran into memory issues recently because I was calling the multiprocessing function many times; it kept spawning processes and leaving them in memory.
Here's the solution I'm using now:
def myParallelProcess(ahugearray):
    from multiprocessing import Pool
    from contextlib import closing
    with closing(Pool(15)) as p:
        # imap_unordered yields results as they complete, chunksize 100;
        # closing() makes sure the pool is closed when the block exits
        res = p.imap_unordered(simple_matching, ahugearray, 100)
        return res
Simply create the pool within your loop and close it at the end of the loop with pool.close().
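For example, here is a minimal sketch of that pattern; the chunk list and the process_item worker are placeholders, not from the original code:

from multiprocessing import Pool
from contextlib import closing

def process_item(item):
    # placeholder worker; swap in your real function
    return item * 2

if __name__ == '__main__':
    list_of_chunks = [range(1000), range(1000, 2000)]  # placeholder data
    for chunk in list_of_chunks:
        # a fresh pool per iteration; closing() guarantees pool.close() runs
        # when the with block exits, so worker processes are released
        with closing(Pool(15)) as p:
            results = list(p.imap_unordered(process_item, chunk, 100))
        p.join()  # wait for the workers to terminate before the next iteration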
Use map_async instead of apply_async to avoid excessive memory usage.
For your first example, change the following two lines:
for index in range(0, 100000):
    pool.apply_async(worker, callback=dummy_func)
to
pool.map_async(worker, range(100000), callback=dummy_func)
It will finish in a blink, before you can even see its memory usage in top. Change the list to a bigger one to see the difference. But note that map_async will first convert the iterable you pass to it into a list in order to calculate its length if it doesn't have a __len__ method. If you have an iterator over a huge number of elements, you can use itertools.islice to process them in smaller chunks.
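A possible sketch of that chunking approach, assuming a placeholder worker function and a dummy data source (neither is from the original question):

from itertools import islice
from multiprocessing import Pool

def worker(x):
    return x * x  # placeholder worker; replace with your real function

def chunks(iterator, size):
    # yield lists of at most `size` items pulled from the iterator with islice,
    # so map_async never has to materialize the whole iterable at once
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

if __name__ == '__main__':
    huge_iterator = iter(range(10000000))  # stands in for the real data source
    pool = Pool(4)
    for chunk in chunks(huge_iterator, 10000):
        # each map_async call only sees a small list, so memory stays bounded
        pool.map_async(worker, chunk).get()
    pool.close()
    pool.join()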
I had a memory problem in a real-life program with much more data, and finally found that the culprit was apply_async.
P.S. With respect to memory usage, your two examples have no obvious difference.