multiprocess or threading in python?

python multithreading multiprocess

If you are truly compute bound, using the multiprocessing module is probably the lightest weight solution (in terms of both memory consumption and implementation difficulty.)

If you are I/O bound, using the threading module will usually give you good results. Make sure that you use thread safe storage (like the Queue) to hand data to your threads. Or else hand them a single piece of data that is unique to them when they are spawned.

PyPy is focused on performance. It has a number of features that can help with compute-bound processing. They also have support for Software Transactional Memory, although that is not yet production quality. The promise is that you can use simpler parallel or concurrent mechanisms than multiprocessing (which has some awkward requirements.)

Stackless Python is also a nice idea. Stackless has portability issues as indicated above. Unladen Swallow was promising, but is now defunct. Pyston is another (unfinished) Python implementation focusing on speed. It is taking an approach different to PyPy, which may yield better (or just different) speedups.

python multithreading multiprocess

Tasks runs like sequentially but you have the illusion that are run in parallel. Tasks are good when you use for file or connection I/O and because are lightweights.

Multiprocess with Pool may be the right solution for you because processes runs in parallel so are very good with intensive computing because each process run in one CPU (or core).

Setup multiprocess may be very easy:

from multiprocessing import Pooldef worker(input_item):    output = do_some_work()    return outputpool = Pool() # it make one process for each CPU (or core) of your PC. Use "Pool(4)" to force to use 4 processes, for example.list_of_results = pool.map(worker, input_list) # Launch all automatically

python multithreading multiprocess

For small collections of data, simply create subprocesses with subprocess.Popen.

Each subprocess can simply get it's piece of data from stdin or from command-line arguments, do it's processing, and simply write the result to an output file.

When the subprocesses have all finished (or timed out), you simply merge the output files.

Very simple.

CodeHunter

multiprocess or threading in python?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last