Understanding shared_memory in Python 3.8


How does shared_memory circumvent the pickle treatment?

I think you are confusing shared ctypes and shared objects between processes.

First, you don't have to use the sharing mechanisms provided by multiprocessing to get shared objects. You can wrap basic primitives such as mmap (or the Windows equivalent) yourself, or get fancier with any API that your OS/kernel provides.
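As a rough illustration of that point, here is a minimal sketch that uses only mmap on a temporary file, with no multiprocessing involved. The function name mmap_roundtrip is made up for this example, and both mappings live in one process purely to keep it short; a second process that opens the same path would see the same bytes.

```python
import mmap
import os
import tempfile


def mmap_roundtrip(payload: bytes = b"hello") -> bytes:
    """Write through one mapping of a file and read through another."""
    size = mmap.PAGESIZE
    fd, path = tempfile.mkstemp()
    try:
        # The file must be at least as large as the region we map.
        os.write(fd, b"\0" * size)
        writer = mmap.mmap(fd, size)
        with open(path, "r+b") as f:
            # Second, independent mapping of the same file.
            reader = mmap.mmap(f.fileno(), size)
        try:
            writer[: len(payload)] = payload
            # No pickling happened: the reader sees the same pages.
            return bytes(reader[: len(payload)])
        finally:
            reader.close()
            writer.close()
    finally:
        os.close(fd)
        os.remove(path)
```

The only thing the two mappings have in common is the file path, which plays the same role as the name of a shared_memory block.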

Next, the second link you mention, about how the copy is done and how __getstate__ defines the pickling behavior, only applies if you use the sharedctypes module API. You are not forced to perform pickling to share memory between two processes.

In fact, sharedctypes is backed by anonymous shared memory which uses: https://github.com/python/cpython/blob/master/Lib/multiprocessing/heap.py#L31

Both implementations rely on an mmap-like primitive.

Anyway, if you try to copy something using sharedctypes, you will hit:

And this function is using ForkingPickler which will make use of pickle and then… ultimately, you'll call __getstate__ somewhere.

But that is not relevant to shared_memory, because shared_memory is not really a ctypes-like object.

You have other ways to share objects between processes, using the Resource Sharer / Tracker API: https://github.com/python/cpython/blob/master/Lib/multiprocessing/resource_sharer.py which will rely on pickle serialization/deserialization.

But you don't share shared memory through shared memory, right?

When you use: https://github.com/python/cpython/blob/master/Lib/multiprocessing/shared_memory.py

You create a block of memory with a unique name, and every process must know that name before it can share the memory; otherwise it will not be able to attach to it.

Basically, the analogy is:

You have a group of friends, and you all share a secret base whose location only you know. You will go on errands and be away from each other, but you can all meet at this one location.

In order for this to work, you must all know the location before going away from each other. If you do not have it beforehand, you are not certain that you will be able to figure out the place to meet them.

It is the same with shared_memory: you only need its name to open it. You don't share or transfer shared_memory between processes. You read from the shared_memory block using its unique name from multiple processes.
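To make the "meet at the secret base" idea concrete, here is a small sketch using multiprocessing.shared_memory. The helper name attach_by_name_demo is hypothetical, and both handles are opened in the same process only to keep the example self-contained; in practice the second handle would be created in another process that was given the name string.

```python
from multiprocessing import shared_memory


def attach_by_name_demo(payload: bytes = b"hello") -> bytes:
    """Create a block, then attach a second handle knowing only its name."""
    owner = shared_memory.SharedMemory(create=True, size=64)
    try:
        owner.buf[: len(payload)] = payload
        # The name is a plain string: this is all another process needs.
        name = owner.name
        peer = shared_memory.SharedMemory(name=name)
        try:
            return bytes(peer.buf[: len(payload)])
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()  # free the block once everyone is done with it
```

Nothing from the block itself is serialized here; the only thing that would need to travel between processes is the short name string.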

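As a result, why would you pickle it? You can. You can absolutely pickle it, but all that has to travel is the name: in CPython, SharedMemory defines __reduce__ so that unpickling re-attaches to the block by name instead of copying its contents. It is just as straightforward to send the unique name to all your processes through another shared memory channel or anything like that.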

There is no circumvention required here. ShareableList is just an example application of the SharedMemory class, as you can see here: https://github.com/python/cpython/blob/master/Lib/multiprocessing/shared_memory.py#L314
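The same pattern can be sketched with ShareableList. The function name shareable_list_demo is invented for illustration, and again both handles sit in one process for brevity; the second handle is built from nothing but the name of the underlying block.

```python
from multiprocessing import shared_memory


def shareable_list_demo():
    """Attach a second ShareableList by name and mutate it in place."""
    original = shared_memory.ShareableList([1, 2.5, "base"])
    try:
        # Attach using only the name of the underlying SharedMemory block.
        attached = shared_memory.ShareableList(name=original.shm.name)
        try:
            attached[0] = 42
            # The change is visible through the original handle.
            return list(original)
        finally:
            attached.shm.close()
    finally:
        original.shm.close()
        original.shm.unlink()
```

The write through the second handle shows up in the first one because both are views over the same named block.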

It requires something akin to a unique name. You can also use anonymous shared memory and transmit its name later through another channel (write a temporary file, send it back to some API, whatever).

Why then is CreateFileMapping \ OpenFileMapping needed here?

Because it depends on your Python interpreter. Here you are probably using CPython, which does the following:

https://github.com/python/cpython/blob/master/Modules/mmapmodule.c#L1440

It already uses CreateFileMapping indirectly, so calling CreateFileMapping yourself and then attaching to the mapping just duplicates work that CPython has already done.

But what about other interpreters? Do all interpreters do what is necessary to make mmap work on non-POSIX platforms? Maybe that was the developer's rationale.

Anyway, it is not surprising that mmap would work out of the box.