Memory not freed after Python's multiprocessing Pool is finished


The garbage collector's generation thresholds may be getting in the way; take a look at gc.get_threshold().

Try including:

gc.disable()
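As a rough way to check whether the collector is a factor at all, its settings can be inspected and a full collection forced by hand. The sketch below uses only the standard gc module; the helper name report_gc_state is made up for illustration. Keep in mind that the cyclic collector only reclaims reference cycles, so objects that are still referenced by a live variable are unaffected by it.

```python
import gc


def report_gc_state():
    """Hypothetical helper: snapshot of the collector's current settings."""
    return {
        "enabled": gc.isenabled(),
        "thresholds": gc.get_threshold(),  # (gen0, gen1, gen2) allocation thresholds
        "counts": gc.get_count(),          # current per-generation counters
    }


state = report_gc_state()
print(state)

# Force a full collection; the return value is the number of
# unreachable objects the collector found.
unreachable = gc.collect()
print("unreachable objects found:", unreachable)
```

If a forced gc.collect() does not change the memory figures, the thresholds are probably not the cause.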


Indeed, there is a leak, but it does not appear for certain parameter values, and I could not work out why. We can, however, reduce the leak by passing a list to pool.map instead of an ndarray:

images_converted = pool.map(rgb2hsv, [i for i in imgs])

This consistently reduces the memory leak in my tests.

OLD ANSWER:

It does not seem that there is a problem in the pool. You should not expect "del pool" on line 31 to free your memory, since what is occupying it are the variables "imgs" and "images_converted". These live in the scope of the function "parallel_convert_all_to_hsv", not in the scope of "rgb2hsv", so "del pool" has no effect on them.

The memory is correctly released after deleting "images" and "images_converted" in lines 56 and 59.
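The point about references can be illustrated without any pool at all. This is a minimal sketch, assuming CPython's reference counting; Payload and pool_stub are made-up stand-ins for the image array and the pool object, and a weak reference is used only to observe when the data is actually freed.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large image array."""

    def __init__(self, size):
        self.data = bytearray(size)


def pool_stub():
    """Hypothetical stand-in for a pool that holds no reference to the data."""
    return object()


images = Payload(10_000_000)
probe = weakref.ref(images)  # does not keep the payload alive
pool = pool_stub()

# Deleting the pool does nothing for the payload: "images" still references it.
del pool
gc.collect()
alive_after_del_pool = probe() is not None

# Dropping the last reference is what releases the payload.
del images
gc.collect()
alive_after_del_images = probe() is not None

print(alive_after_del_pool, alive_after_del_images)
```

The same reasoning applies to any variable that still holds the input or the converted results: the memory stays allocated until the last such reference goes away.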


Since multiprocessing.Pool is not able to free up around 1 GB of memory, I also tried replacing it with ThreadPool, but it was no better. I am still wondering about this memory leak inside Pools.

This may not be the best solution, but it can serve as a workaround.

Instead of using ThreadPool or ProcessPool, I create the threads or processes manually and assign each one an image to convert to HSV. I have commented out the line p = multiprocessing.Process(target=do_hsv, args=(imgs[j], shared_list)) because it would spawn a new process for each image conversion, which I think is overkill and much more expensive than threads. Creating threads manually takes more time (9 sec, without the memory leak) than ThreadPool (4 sec, but with the memory leak), but as you can see, memory usage stays almost flat.

Here is my code:

import multiprocessing
import os
import threading
import time

from memory_profiler import profile
import numpy as np
from skimage.color import rgb2hsv


def do_hsv(img, shared_list):
    shared_list.append(rgb2hsv(img))
    # print("Converted by process {} having parent process {}".format(os.getpid(), os.getppid()))


@profile
def parallel_convert_all_to_hsv(imgs, shared_list):
    cores = os.cpu_count()
    starttime = time.time()
    for i in range(0, len(imgs), cores):
        # print("i :", i)
        jobs = []; pipes = []
        end = i + cores if (i + cores) <= len(imgs) else i + len(imgs[i : -1]) + 1
        # print("end :", end)
        for j in range(i, end):
            # print("j :", j)
            # p = multiprocessing.Process(target=do_hsv, args=(imgs[j], shared_list))
            p = threading.Thread(target= do_hsv, args=(imgs[j], shared_list))
            jobs.append(p)
        for p in jobs: p.start()
        for proc in jobs:
            proc.join()
    print("Took {} seconds to complete ".format(starttime - time.time()))
    return 1


@profile
def doit():
    print("create random images")
    max_images = 700
    images = np.random.rand(max_images, 300, 300,3)
    # images = [x for x in range(0, 10000)]
    manager = multiprocessing.Manager()
    shared_list = manager.list()
    parallel_convert_all_to_hsv(images, shared_list)
    del images
    del shared_list
    print()


doit()

Here is the Output:

create random images
Took -9.085552453994751 seconds to complete
Filename: MemoryNotFreed.py

Line #    Mem usage    Increment   Line Contents
================================================
    15   1549.1 MiB   1549.1 MiB   @profile
    16                             def parallel_convert_all_to_hsv(imgs, shared_list):
    17
    18   1549.1 MiB      0.0 MiB       cores = os.cpu_count()
    19
    20   1549.1 MiB      0.0 MiB       starttime = time.time()
    21
    22   1566.4 MiB      0.0 MiB       for i in range(0, len(imgs), cores):
    23
    24                                     # print("i :", i)
    25
    26   1566.4 MiB      0.0 MiB           jobs = []; pipes = []
    27
    28   1566.4 MiB      0.0 MiB           end = i + cores if (i + cores) <= len(imgs) else i + len(imgs[i : -1]) + 1
    29
    30                                     # print("end :", end)
    31
    32   1566.4 MiB      0.0 MiB           for j in range(i, end):
    33                                         # print("j :", j)
    34
    35                                         # p = multiprocessing.Process(target=do_hsv, args=(imgs[j], shared_list))
    36   1566.4 MiB      0.0 MiB               p = threading.Thread(target= do_hsv, args=(imgs[j], shared_list))
    37
    38   1566.4 MiB      0.0 MiB               jobs.append(p)
    39
    40   1566.4 MiB      0.8 MiB           for p in jobs: p.start()
    41
    42   1574.9 MiB      1.0 MiB           for proc in jobs:
    43   1574.9 MiB     13.5 MiB               proc.join()
    44
    45   1563.5 MiB      0.0 MiB       print("Took {} seconds to complete ".format(starttime - time.time()))
    46   1563.5 MiB      0.0 MiB       return 1

Filename: MemoryNotFreed.py

Line #    Mem usage    Increment   Line Contents
================================================
    48    106.6 MiB    106.6 MiB   @profile
    49                             def doit():
    50
    51    106.6 MiB      0.0 MiB       print("create random images")
    52
    53    106.6 MiB      0.0 MiB       max_images = 700
    54
    55   1548.7 MiB   1442.1 MiB       images = np.random.rand(max_images, 300, 300,3)
    56
    57                                 # images = [x for x in range(0, 10000)]
    58   1549.0 MiB      0.3 MiB       manager = multiprocessing.Manager()
    59   1549.1 MiB      0.0 MiB       shared_list = manager.list()
    60
    61   1563.5 MiB     14.5 MiB       parallel_convert_all_to_hsv(images, shared_list)
    62
    63    121.6 MiB      0.0 MiB       del images
    64
    65    121.6 MiB      0.0 MiB       del shared_list
    66
    67    121.6 MiB      0.0 MiB       print()
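
For reference, the batching pattern in the code above can be written a little more compactly. The following is only a sketch under a few assumptions: convert is a placeholder for rgb2hsv so that the example runs with the standard library alone, a plain list guarded by a lock replaces the manager list (threads share memory, so a manager is not strictly needed), and the batch end is computed with min(). The elapsed time is taken as "now minus start", which avoids the negative figure printed in the output above.

```python
import os
import threading
import time


def convert(item, results, lock):
    """Placeholder for rgb2hsv (hypothetical work: doubles the item)."""
    value = item * 2
    with lock:
        results.append(value)


def run_in_batches(items, batch_size=None):
    """Process items in batches of threads, one thread per item."""
    batch_size = batch_size or os.cpu_count() or 1
    results = []
    lock = threading.Lock()
    start = time.perf_counter()
    for i in range(0, len(items), batch_size):
        end = min(i + batch_size, len(items))  # last batch may be shorter
        threads = [
            threading.Thread(target=convert, args=(items[j], results, lock))
            for j in range(i, end)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # wait for the whole batch before starting the next
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = run_in_batches(list(range(10)), batch_size=4)
print(sorted(results), "{:.4f} s".format(elapsed))
```

Results arrive in completion order, not input order, so they are sorted here before printing; real image data would need an index carried alongside each result if order matters.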