Multiprocessing and Multithreading


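Instead of passing the subprocess.Popen object (which starts the command at the point where it is first defined, so the commands run serially instead of in parallel), pass the command itself and create the Popen inside the target function: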

    import time
    import subprocess
    from multiprocessing import Process

    cmd1 = ['nice', 'time', 'java', '-Xmx6G', '-jar', '/comparison/old_picard/MarkDuplicates.jar',
            'I=/comparison/old.bam', 'O=/comparison/old_picard/markdups/old.dupsFlagged.bam',
            'M=/comparison/old_picard/markdups/old.metrics.txt', 'TMP_DIR=/comparison',
            'VALIDATION_STRINGENCY=LENIENT', 'ASSUME_SORTED=true']
    cmd2 = ['nice', 'time', 'java', '-Xmx6G', '-jar', '/comparison/new_picard/MarkDuplicates.jar',
            'I=/comparison/new.bam', 'O=/comparison/new_picard/markdups/new.dupsFlagged.bam',
            'M=/comparison/new_picard/markdups/new.metrics.txt', 'TMP_DIR=/comparison',
            'VALIDATION_STRINGENCY=LENIENT', 'ASSUME_SORTED=true']

    def timeit(cmd):
        print(cmd)
        past = time.time()
        # Popen is created here, inside the child process, so each command
        # only starts once its Process has been started.
        p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
        results = [p.communicate()]  # blocks until the command finishes
        present = time.time()
        total = present - past
        results.append(total)
        return results

    p1 = Process(target=timeit, args=(cmd1,))
    p2 = Process(target=timeit, args=(cmd2,))

    # Start both before joining either, so the two commands overlap.
    for p in (p1, p2):
        p.start()
    for p in (p1, p2):
        p.join()
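
One thing to watch: the value returned by a Process target is not handed back to the parent, so the timings computed in timeit are lost unless they are printed or sent back some other way. Below is a rough sketch of one way to collect them. It swaps multiprocessing.Process for a thread pool from concurrent.futures, which should be enough here because the heavy work happens in the child commands. The name timed_run and the two placeholder commands are assumptions for illustration, not part of the original question.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def timed_run(cmd):
    """Run cmd to completion and return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    proc.communicate()  # wait for the child and drain its pipes
    return time.perf_counter() - start, proc.returncode


# Placeholder commands (assumptions): each one just sleeps in a child Python.
cmd1 = [sys.executable, "-c", "import time; time.sleep(0.2)"]
cmd2 = [sys.executable, "-c", "import time; time.sleep(0.2)"]

# Each thread only waits on its own child process, so the two commands overlap.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(timed_run, [cmd1, cmd2]))

for cmd, (elapsed, code) in zip((cmd1, cmd2), results):
    print("exit=%d  %.2fs  %s" % (code, elapsed, " ".join(cmd)))
```

pool.map returns the results in the same order as the commands, so results[0] belongs to cmd1 and results[1] to cmd2.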

ETA: While the above solution is the way to do multiprocessing in general, @Jordan is exactly right that you shouldn't use this approach to time two versions of software. Why not run them sequentially?


I think something like this should work:

    p1 = Process(target=timeit, args=(c1,))
    p2 = Process(target=timeit, args=(c2,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()

I don't know where your iteration error is though (what line is it?).

Also, I think you'd be better off running them separately. When you're running them together, you run the risk of one process being given more CPU time and appearing to be faster, even when it isn't.
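
For the timing comparison itself, running the commands one after another could look roughly like the sketch below. The helper name time_command, the repeats parameter, and the placeholder commands are assumptions for illustration. Repeating each run is an extra that was not part of the question, added only to smooth out noise.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times, one run at a time, and return the elapsed times."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return timings


# Placeholder commands (assumptions); substitute the real command lists.
c1 = [sys.executable, "-c", "pass"]
c2 = [sys.executable, "-c", "pass"]

# Only one command runs at any moment, so neither competes with the other for CPU.
for name, cmd in (("c1", c1), ("c2", c2)):
    timings = time_command(cmd)
    print("%s: best %.3fs over %d runs" % (name, min(timings), len(timings)))
```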