
Shared memory in multiprocessing


Generally speaking, there are two ways to share the same data:

  • Multithreading
  • Shared memory

Python's multithreading is not suitable for CPU-bound tasks (because of the GIL), so the usual solution in that case is to go with multiprocessing instead. However, with this solution you need to explicitly share the data, using multiprocessing.Value and multiprocessing.Array.

Note that sharing data between processes is usually not the best choice, because of all the synchronization issues; an approach where actors exchange messages is usually seen as a better one. See also the Python documentation:

As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes.

However, if you really do need to use some shared data then multiprocessing provides a couple of ways of doing so.
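
For illustration, here is a minimal message-passing sketch (the names and the squaring task are mine, not from the question) where worker processes communicate only through a multiprocessing.Queue instead of touching shared state:

from multiprocessing import Process, Queue

def worker(tasks, results):
    # each worker only ever sees the messages it receives
    for item in iter(tasks.get, None):  # None is the shutdown sentinel
        results.put(item * item)

if __name__ == "__main__":
    tasks, results = Queue(), Queue()
    procs = [Process(target=worker, args=(tasks, results)) for _ in range(4)]
    for p in procs:
        p.start()
    for i in range(20):
        tasks.put(i)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    for p in procs:
        p.join()
    print(sorted(results.get() for _ in range(20)))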

In your case, you need to wrap l1, l2 and l3 in some way understandable by multiprocessing (e.g. by using a multiprocessing.Array), and then pass them as parameters.
Note also that, since you said you do not need write access, you should pass lock=False while creating the objects, or all accesses will still be serialized.
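
A rough sketch of that idea (the contents of l1, l2 and l3 are an assumption here; I use integer arrays):

from multiprocessing import Process, Array

def worker(a1, a2, a3):
    # read-only access: the arrays live in shared memory, nothing is copied per child
    print(sum(a1[:]) + sum(a2[:]) + sum(a3[:]))

if __name__ == "__main__":
    l1, l2, l3 = range(1000), range(1000), range(1000)
    # lock=False because the children only read; with a lock every access is serialized
    a1 = Array('i', l1, lock=False)
    a2 = Array('i', l2, lock=False)
    a3 = Array('i', l3, lock=False)
    procs = [Process(target=worker, args=(a1, a2, a3)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()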


Because this is still a very high result on Google and no one else has mentioned it yet, I thought I would mention the 'true' shared memory introduced in Python 3.8.0: https://docs.python.org/3/library/multiprocessing.shared_memory.html

I have included a small contrived example here (tested on Linux) that uses NumPy arrays, which is likely a very common use case:

import time

import numpy as np
from multiprocessing import shared_memory, Process, Lock
from multiprocessing import cpu_count, current_process

# one dimension of the 2d array which is shared
dim = 5000

lock = Lock()

def add_one(shr_name):
    existing_shm = shared_memory.SharedMemory(name=shr_name)
    np_array = np.ndarray((dim, dim), dtype=np.int64, buffer=existing_shm.buf)
    lock.acquire()
    np_array[:] = np_array[0] + 1
    lock.release()
    time.sleep(10)  # pause, to see the memory usage in top
    print('added one')
    existing_shm.close()

def create_shared_block():
    a = np.ones(shape=(dim, dim), dtype=np.int64)  # start with an existing NumPy array
    shm = shared_memory.SharedMemory(create=True, size=a.nbytes)
    # now create a NumPy array backed by the shared memory block
    np_array = np.ndarray(a.shape, dtype=np.int64, buffer=shm.buf)
    np_array[:] = a[:]  # copy the original data into shared memory
    return shm, np_array

if current_process().name == "MainProcess":
    print("creating shared block")
    shr, np_array = create_shared_block()

    processes = []
    for i in range(cpu_count()):
        _process = Process(target=add_one, args=(shr.name,))
        processes.append(_process)
        _process.start()

    for _process in processes:
        _process.join()

    print("Final array")
    print(np_array[:10])
    print(np_array[10:])

    shr.close()
    shr.unlink()

Note that because of the 64-bit ints this code can take about 1 GB of RAM to run, so make sure you won't freeze your system using it. ^_^


If you want to make use of the copy-on-write feature and your data is static (unchanged in the child processes), you should keep Python from touching the memory blocks where your data lies. You can do this by using C or C++ structures (the STL, for instance) as containers and providing your own Python wrappers that use pointers to the data memory (or possibly copy it) when a Python-level object is created, if one is created at all. All of this can be done very easily, with almost Python-level simplicity and syntax, using Cython.

# pseudo cython
from libc.stdlib cimport malloc
from libc.string cimport memcpy

cdef class FooContainer:
    cdef char * data

    def __cinit__(self, char * foo_value):
        self.data = <char *> malloc(1024 * sizeof(char))
        memcpy(self.data, foo_value, min(1024, len(foo_value)))

    def get(self):
        return self.data
# python part
from os import fork
from foo import FooContainer

f = FooContainer("hello world")
pid = fork()
if not pid:
    f.get()  # this call will read the same memory page to which
             # the parent process wrote 1024 chars of self.data,
             # and cython will automatically create a new python string
             # object from it and return it to the caller

The above pseudo-code is badly written. Don't use it. In place of self.data there should be a C or C++ container in your case.