How to run functions in parallel?

You could use threading or multiprocessing.

Due to peculiarities of CPython, threading is unlikely to achieve true parallelism. For this reason, multiprocessing is generally a better bet.

Here is a complete example:

from multiprocessing import Processdef func1():  print 'func1: starting'  for i in xrange(10000000): pass  print 'func1: finishing'def func2():  print 'func2: starting'  for i in xrange(10000000): pass  print 'func2: finishing'if __name__ == '__main__':  p1 = Process(target=func1)  p1.start()  p2 = Process(target=func2)  p2.start()  p1.join()  p2.join()

The mechanics of starting/joining child processes can easily be encapsulated into a function along the lines of your runBothFunc:

def runInParallel(*fns):  proc = []  for fn in fns:    p = Process(target=fn)    p.start()    proc.append(p)  for p in proc:    p.join()runInParallel(func1, func2)

python multithreading multiprocessing

If your functions are mainly doing I/O work (and less CPU work) and you have Python 3.2+, you can use a ThreadPoolExecutor:

from concurrent.futures import ThreadPoolExecutordef run_io_tasks_in_parallel(tasks):    with ThreadPoolExecutor() as executor:        running_tasks = [executor.submit(task) for task in tasks]        for running_task in running_tasks:            running_task.result()run_io_tasks_in_parallel([    lambda: print('IO task 1 running!'),    lambda: print('IO task 2 running!'),])

If your functions are mainly doing CPU work (and less I/O work) and you have Python 2.6+, you can use the multiprocessing module:

from multiprocessing import Processdef run_cpu_tasks_in_parallel(tasks):    running_tasks = [Process(target=task) for task in tasks]    for running_task in running_tasks:        running_task.start()    for running_task in running_tasks:        running_task.join()run_cpu_tasks_in_parallel([    lambda: print('CPU task 1 running!'),    lambda: print('CPU task 2 running!'),])

python multithreading multiprocessing

This can be done elegantly with Ray, a system that allows you to easily parallelize and distribute your Python code.

To parallelize your example, you'd need to define your functions with the @ray.remote decorator, and then invoke them with .remote.

import rayray.init()dir1 = 'C:\\folder1'dir2 = 'C:\\folder2'filename = 'test.txt'addFiles = [25, 5, 15, 35, 45, 25, 5, 15, 35, 45]# Define the functions. # You need to pass every global variable used by the function as an argument.# This is needed because each remote function runs in a different process,# and thus it does not have access to the global variables defined in # the current process.@ray.remotedef func1(filename, addFiles, dir):    # func1() code here...@ray.remotedef func2(filename, addFiles, dir):    # func2() code here...# Start two tasks in the background and wait for them to finish.ray.get([func1.remote(filename, addFiles, dir1), func2.remote(filename, addFiles, dir2)])

If you pass the same argument to both functions and the argument is large, a more efficient way to do this is using ray.put(). This avoids the large argument to be serialized twice and to create two memory copies of it:

largeData_id = ray.put(largeData)ray.get([func1(largeData_id), func2(largeData_id)])

Important - If func1() and func2() return results, you need to rewrite the code as follows:

ret_id1 = func1.remote(filename, addFiles, dir1)ret_id2 = func2.remote(filename, addFiles, dir2)ret1, ret2 = ray.get([ret_id1, ret_id2])

There are a number of advantages of using Ray over the multiprocessing module. In particular, the same code will run on a single machine as well as on a cluster of machines. For more advantages of Ray see this related post.

CodeHunter

How to run functions in parallel?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last