Why can Linux accept sockets in multiprocessing?


On Unix platforms, sockets and other file descriptors can be sent to a different process over Unix domain (AF_UNIX) sockets, so sockets can be pickled in the context of multiprocessing.
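
As a rough illustration of that mechanism (this is not multiprocessing's own code), the sketch below passes the read end of a pipe across an AF_UNIX socket pair as SCM_RIGHTS ancillary data, along the lines of the recipe in the Python socket documentation. It assumes a Unix platform and Python 3.3 or later; the names send_fd, recv_fd and pass_pipe_over_unix_socket are made up for this example. Both ends live in one process only to keep it short; the same calls apply when the two ends belong to different processes.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    # A one-byte payload is sent alongside; the descriptor travels
    # in the ancillary (control) part of the message.
    sock.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor that was sent with send_fd()."""
    fds = array.array("i")
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_SPACE(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Drop any trailing partial integer before decoding.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def pass_pipe_over_unix_socket():
    """Ship a pipe's read end over a Unix socket and read through the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()
    try:
        send_fd(sender, read_fd)
        received_fd = recv_fd(receiver)  # a new descriptor for the same pipe
        os.write(write_fd, b"hello")
        data = os.read(received_fd, 5)
        os.close(received_fd)
        return data
    finally:
        os.close(read_fd)
        os.close(write_fd)
        sender.close()
        receiver.close()
```

The received descriptor usually has a different number from the one that was sent, but it refers to the same open pipe, which is why data written to the write end can be read through it.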

The multiprocessing module uses a special pickler, ForkingPickler, instead of the regular pickler to pickle sockets and file descriptors, which can then be unpickled in a different process. This is only possible because it is known where the pickled instance will be unpickled; it wouldn't make sense to pickle a socket or file descriptor and send it across machine boundaries.
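
To make the difference concrete, here is a minimal sketch, assuming a Unix system where the "fork" start method is available and a reasonably recent Python 3. The function names (plain_pickle_rejects, child, send_socket_to_child) are invented for the example. The first function checks that the regular pickler refuses a socket; the second sends one end of a socket pair to a child process through a multiprocessing pipe, which pickles it with ForkingPickler behind the scenes.

```python
import multiprocessing as mp
import pickle
import socket


def plain_pickle_rejects(sock):
    """Return True if the regular pickler refuses to pickle the socket."""
    try:
        pickle.dumps(sock)
    except TypeError:
        return True
    return False


def child(conn):
    # The socket arrives rebuilt around a duplicated file descriptor.
    sock = conn.recv()
    sock.sendall(b"hello from child")
    sock.close()


def send_socket_to_child():
    ctx = mp.get_context("fork")  # assumption: Unix, fork available
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=child, args=(child_conn,))
    proc.start()

    # Created after the child has started, so the child cannot simply
    # inherit it; it has to be transferred explicitly.
    ours, theirs = socket.socketpair()
    ours.settimeout(10)

    parent_conn.send(theirs)  # pickled by multiprocessing, not plain pickle
    theirs.close()            # the child works on its own duplicate

    data = ours.recv(1024)
    proc.join()
    ours.close()
    return data
```

Calling send_socket_to_child() should return the bytes the child wrote into the socket it received, while plain_pickle_rejects(socket.socket()) shows that an ordinary pickle.dumps call on the same kind of object fails.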

On Windows there are similar mechanisms for open file handles.


I think the issue is that multiprocessing uses a different pickler for Windows and non-Windows systems. On Windows, there is no real fork(), and the pickling that is done is equivalent to pickling across machine boundaries (i.e. distributed computing). On non-Windows systems, objects (like file descriptors) can be shared across process boundaries. Thus, pickling on Windows systems (with pickle) is more limited.

The multiprocessing package does use copy_reg (copyreg in Python 3) to register reducers for a few object types, and one of those types is a socket. However, the serialization of socket objects on Windows is more limited, because the pickler used there is weaker.
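
For anyone unfamiliar with copy_reg (renamed copyreg in Python 3), this is roughly what registering a type looks like. It is a generic sketch, not what multiprocessing does for sockets: the Counter class and both functions are hypothetical, picked only because an object holding a lock cannot be pickled by default.

```python
import copyreg
import pickle
import threading


class Counter:
    """Hypothetical class holding a lock, which pickle cannot handle by default."""

    def __init__(self, value=0):
        self.value = value
        self.lock = threading.Lock()


def reduce_counter(counter):
    # Tell pickle to rebuild the object as Counter(value);
    # the new instance creates a fresh lock of its own.
    return Counter, (counter.value,)


def roundtrip_with_reducer(value):
    """Register the reducer, then pickle and unpickle a Counter."""
    copyreg.pickle(Counter, reduce_counter)
    clone = pickle.loads(pickle.dumps(Counter(value)))
    return clone.value
```

The reducer decides what actually gets serialized (here just the integer) and how the object is rebuilt on the other side, which is the same idea a socket reducer relies on.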

On a related note, if you do want to send a socket object with multiprocessing on Windows, you can… you just have to use the package multiprocess, which uses dill instead of pickle. dill has a better serializer that can pickle socket objects on any OS, and thus sending the socket object with multiprocess works in either case.

dill has the function copy, which is essentially loads(dumps(object)) and is useful for checking that an object can be serialized. dill also has check, which performs a copy but with the more restrictive "Windows" style fork-like operation. This allows users on non-Windows systems to emulate a copy on a Windows system, or across distributed resources.

    >>> import dill
    >>> import socket
    >>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    >>> s.connect(('www.python.org', 80))
    >>> s.sendall(b'GET / HTTP/1.1\r\nHost: www.python.org\r\n\r\n')
    >>>
    >>> dill.copy(s)
    <socket._socketobject object at 0x10e55b9f0>
    >>> dill.check(s)
    <socket._socketobject object at 0x1059628a0>
    >>>

In short, the difference is caused by the pickler that multiprocessing uses on Windows being different from the pickler it uses on non-Windows systems. However, it is possible (and easy) to have it work on any OS by using a better serializer (as is used in multiprocess).