
Python performance - best parallelism approach


When using parallelism in Python, a good approach is to use either ThreadPoolExecutor or ProcessPoolExecutor from the concurrent.futures module (https://docs.python.org/3/library/concurrent.futures.html#module-concurrent.futures); both work well in my experience.

Here is an example of ThreadPoolExecutor that can be adapted for your use.

import concurrent.futures
import time

# Target IPs (placeholder address repeated for illustration)
IPs = ['168.212.226.204',
       '168.212.226.204',
       '168.212.226.204',
       '168.212.226.204',
       '168.212.226.204']

def send_pkt(ip):
    status = 'Failed'
    while True:
        # send pkt (simulated here with a sleep)
        time.sleep(10)
        status = 'Successful'
        break
    return status

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    future_to_ip = {executor.submit(send_pkt, ip): ip for ip in IPs}
    for future in concurrent.futures.as_completed(future_to_ip):
        ip = future_to_ip[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (ip, exc))
        else:
            print('%r send %s' % (ip, data))
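If the work turns out to be CPU-bound rather than IO-bound, the same pattern can be switched to ProcessPoolExecutor with only small changes. The sketch below is an illustrative variant, not part of the original answer: it assumes send_pkt is defined at module level so it can be pickled, and guards the pool creation with if __name__ == '__main__' as the multiprocessing docs require on platforms that spawn processes.

import concurrent.futures
import time

def send_pkt(ip):
    # Placeholder for the real send; sleeping stands in for the work
    time.sleep(10)
    return 'Successful'

IPs = ['168.212.226.204'] * 5

if __name__ == '__main__':
    # Same submit/as_completed pattern, but each task runs in a separate process
    with concurrent.futures.ProcessPoolExecutor(max_workers=5) as executor:
        future_to_ip = {executor.submit(send_pkt, ip): ip for ip in IPs}
        for future in concurrent.futures.as_completed(future_to_ip):
            ip = future_to_ip[future]
            print('%r send %s' % (ip, future.result()))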


Your result in option 3, "due to excessive quantity of process the VM where I am running the script freezes (of course, 1500 process running)", could bear further investigation. From the information gathered so far it is not yet clear whether this is better characterized as a shortcoming of the multiprocessing approach or as a limitation of the VM.

One fairly simple and straightforward approach would be to run a scaling experiment: rather than having all sends happen from individual processes or all from the same one, try intermediate values. Time how long the workload takes when split between 2 processes, then 4, 8, and so on, as in the sketch below.
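As an illustration of that kind of scaling experiment (a hedged sketch, not the original poster's code; send_batch, the chunking helper, and the worker counts are assumptions), something like the following times the same workload at several process counts:

import multiprocessing
import time

def send_batch(ips):
    # Stand-in for sending packets to one slice of the IP list
    for ip in ips:
        time.sleep(0.01)
    return len(ips)

def chunk(items, n):
    # Split items into n roughly equal slices
    size = (len(items) + n - 1) // n
    return [items[i:i + size] for i in range(0, len(items), size)]

if __name__ == '__main__':
    ips = ['168.212.226.204'] * 200
    for workers in (1, 2, 4, 8, 16):
        start = time.perf_counter()
        with multiprocessing.Pool(processes=workers) as pool:
            pool.map(send_batch, chunk(ips, workers))
        print('%2d processes: %.2f s' % (workers, time.perf_counter() - start))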

While doing that, it may also be a good idea to run a tool like xperf on Windows or oprofile on Linux to see whether the different choices of parallelism lead to different kinds of bottlenecks, for example thrashing the CPU cache or running the VM out of memory. The easiest way to find out is to try it and measure.

Based on prior experience with these types of problems and general rules of thumb, I would expect the best performance when the number of multiprocessing processes is less than or equal to the number of available CPU cores (either on the VM itself or on the hypervisor). That is, however, assuming the problem is CPU-bound; it's possible performance would still be higher with more processes than cores if something blocks during packet sending, since interleaving those blocking operations would make better use of CPU time. Again though, we don't know until some profiling and/or scaling experiments are done.
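As a rough starting point for that rule of thumb (a sketch, not taken from the answer; the workload and the capping policy are assumptions), the pool size can be derived from os.cpu_count() instead of spawning one process per IP:

import os
import multiprocessing

def send_pkt(ip):
    # Real packet-sending work would go here
    return 'Successful'

if __name__ == '__main__':
    ips = ['168.212.226.204'] * 1500
    # Cap the pool at the number of cores instead of 1500 processes
    workers = min(len(ips), os.cpu_count() or 1)
    with multiprocessing.Pool(processes=workers) as pool:
        results = pool.map(send_pkt, ips)
    print('%d sends completed' % len(results))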


You are correct that Python is effectively single-threaded (because of the GIL), however your desired task (sending network packets) is an IO-bound operation and therefore a good candidate for multi-threading. Your main thread is not busy while the packets are transmitting, as long as you write your code with async in mind.

Take a look at the Python docs on async TCP networking: https://docs.python.org/3/library/asyncio-protocol.html#tcp-echo-client.
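As a rough sketch of that asyncio approach (illustrative only; the port, payload, and host list are placeholders, and the linked docs use a protocol class rather than the stream API shown here), concurrent sends can be expressed with asyncio.open_connection and asyncio.gather:

import asyncio

async def send_pkt(host, port, payload):
    # Open a TCP connection, write the payload, then close the stream
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(payload)
    await writer.drain()
    writer.close()
    await writer.wait_closed()
    return 'Successful'

async def main():
    hosts = ['168.212.226.204'] * 5
    # All sends are scheduled concurrently on a single thread;
    # port 9999 and b'ping' are placeholder values
    results = await asyncio.gather(
        *(send_pkt(host, 9999, b'ping') for host in hosts),
        return_exceptions=True)
    for host, result in zip(hosts, results):
        print('%r send %s' % (host, result))

if __name__ == '__main__':
    asyncio.run(main())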