Why can Linux accept sockets in multiprocessing?
On Unix platforms, sockets and other file descriptors can be sent to a different process over Unix domain (AF_UNIX) sockets, so sockets can be pickled in the context of multiprocessing.
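As a rough illustration of the descriptor passing described above, here is a minimal sketch, assuming a Unix system and Python 3.9+ (where socket.send_fds and socket.recv_fds are available). The helper name pass_fd_over_unix_socket is made up for this example, and both ends live in one process only to keep it short; this is not the code multiprocessing itself runs.

```python
import os
import socket


def pass_fd_over_unix_socket(payload=b"hello"):
    """Send a pipe's read end across an AF_UNIX socket pair and read through the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()
    received_fd = None
    try:
        # The descriptor travels as ancillary data (SCM_RIGHTS) next to a small message.
        socket.send_fds(sender, [b"fd"], [read_fd])
        message, fds, _flags, _address = socket.recv_fds(receiver, 1024, 1)
        received_fd = fds[0]

        # The received descriptor is a new number that refers to the same open pipe.
        os.write(write_fd, payload)
        data = os.read(received_fd, len(payload))
        return message, data, received_fd != read_fd
    finally:
        for fd in (read_fd, write_fd, received_fd):
            if fd is not None:
                os.close(fd)
        sender.close()
        receiver.close()
```

In a real setup the receiving end would sit in another process on the same machine, which is why this only works locally.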
Instead of a regular pickler, the multiprocessing module uses a special pickler, ForkingPickler, to pickle sockets and file descriptors, which can then be unpickled in a different process. This is only possible because it is known where the pickled instance will be unpickled; it wouldn't make sense to pickle a socket or file descriptor and send it across machine boundaries.
On Windows there are similar mechanisms for open file handles.
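To make the contrast with a regular pickler concrete, the following sketch (assuming Python 3; the function name compare_picklers is hypothetical) checks that ForkingPickler is a pickle.Pickler subclass whose output for ordinary data is readable by pickle.loads, and that plain pickle.dumps refuses a socket. It deliberately does not pickle the socket with ForkingPickler, since that involves platform-specific resource sharing.

```python
import pickle
import socket
from multiprocessing.reduction import ForkingPickler


def compare_picklers():
    """Return (is_subclass, round-tripped data, plain-pickle error text)."""
    is_subclass = issubclass(ForkingPickler, pickle.Pickler)

    # For ordinary objects ForkingPickler emits standard pickle data.
    restored = pickle.loads(ForkingPickler.dumps({"port": 80}))

    # The regular pickler has no reducer for sockets and raises TypeError.
    error = None
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            pickle.dumps(sock)
        except TypeError as exc:
            error = str(exc)
    return is_subclass, restored, error
```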
I think the issue is that multiprocessing uses a different pickler for Windows and non-Windows systems. On Windows, there is no real fork(), and the pickling that is done is equivalent to pickling across machine boundaries (i.e. distributed computing). On non-Windows systems, objects (like file descriptors) can be shared across process boundaries. Thus, pickling on Windows systems (with pickle) is more limited.
The multiprocessing package does use copy_reg to register a few object types to pickle, and one of those types is socket. However, the serialization of the socket object that is used on Windows is more limited due to the Windows pickler being weaker.
On a related note, if you do want to send a socket object with multiprocessing on Windows, you can; you just have to use the multiprocess package, which uses dill instead of pickle. dill has a better serializer that can pickle socket objects on any OS, so sending the socket object with multiprocess works in either case.
dill has the function copy, which is essentially loads(dumps(object)) and is useful for checking that an object can be serialized. dill also has check, which performs copy but with the more restrictive "Windows"-style fork-like operation. This allows users on non-Windows systems to emulate a copy on a Windows system, or across distributed resources.
>>> import dill
>>> import socket
>>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> s.connect(('www.python.org', 80))
>>> s.sendall(b'GET / HTTP/1.1\r\nHost: www.python.org\r\n\r\n')
>>>
>>> dill.copy(s)
<socket._socketobject object at 0x10e55b9f0>
>>> dill.check(s)
<socket._socketobject object at 0x1059628a0>
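The loads(dumps(object)) idea can also be sketched with the standard pickle module to test whether an object survives a round trip. This is only an analogy under that assumption, not dill's implementation, and the helper name survives_roundtrip is made up for the example.

```python
import pickle
import threading


def survives_roundtrip(obj):
    """Return True if obj can be pickled and unpickled with the standard pickler."""
    try:
        pickle.loads(pickle.dumps(obj))
    except Exception:
        # Any failure while dumping or loading means the object is not serializable here.
        return False
    return True


# Plain data round-trips; an OS-level resource such as a lock does not.
plain_ok = survives_roundtrip({"host": "www.python.org", "port": 80})
lock_ok = survives_roundtrip(threading.Lock())
```

With the standard pickler, plain_ok is True and lock_ok is False; a serializer with more reducers may accept objects that fail this check.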
In short, the difference is caused by the pickler that multiprocessing uses on Windows being different from the pickler it uses on non-Windows systems. However, it is possible (and easy) to have it work on any OS by using a better serializer (as is used in multiprocess).