Run 4 concurrent instances of a python script on a folder of data files
You can use the multiprocessing module. I suppose you have a list of files to process and a function to call for each file. Then you could simply use a worker pool like this:
```python
from multiprocessing import Pool, cpu_count

pool = Pool(processes=cpu_count())
pool.map(process_function, file_list, chunksize=1)
```
If your `process_function` doesn't return a value, you can simply ignore the return value.
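For completeness, a self-contained sketch of the same idea; the `process_function` body and the file names are placeholders, not from the original answer, and the `if __name__ == "__main__"` guard matters on platforms that spawn rather than fork worker processes:

```python
from multiprocessing import Pool

def process_function(path):
    # Placeholder: the real per-file work (e.g. parsing a data file)
    # would go here.
    return path + ".out"

if __name__ == "__main__":
    file_list = ["a.fq", "b.fq", "c.fq"]  # hypothetical input files
    with Pool(processes=4) as pool:       # four concurrent workers
        results = pool.map(process_function, file_list, chunksize=1)
    print(results)
```

`chunksize=1` hands workers one file at a time, which keeps all four busy even when some files take much longer than others.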
Take a look at `xargs`. Its `-P` option offers a configurable degree of parallelism. Specifically, something like this should work for you:
```shell
ls files* | awk '{print $1,$1".out"}' | xargs -P 4 -n 2 python fastq_groom.py
```
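If the file names might contain whitespace, the same idea works without parsing `ls`. Here is a sketch using a NUL-delimited glob (GNU `xargs` assumed); the `.fq` names are hypothetical and `touch {}.out` stands in for the real `python fastq_groom.py {} {}.out` call:

```shell
# Work in a throwaway directory with dummy input files.
dir=$(mktemp -d)
cd "$dir"
touch a.fq b.fq "c d.fq"

# printf '%s\0' plus xargs -0 is safe for any file name; -P 4 runs up
# to four jobs at once, and -I {} substitutes each file name into the
# command line.
printf '%s\0' *.fq | xargs -0 -P 4 -I {} touch {}.out
```

Swap `touch {}.out` for the real command to process your files four at a time.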
Give this a shot:
```shell
#!/bin/bash
files=( * )
for ((i=0; i<${#files[@]}; i+=4)); do
    python fastq_groom.py "${files[$i]}"   "${files[$i]}".out &
    python fastq_groom.py "${files[$i+1]}" "${files[$i+1]}".out &
    python fastq_groom.py "${files[$i+2]}" "${files[$i+2]}".out &
    python fastq_groom.py "${files[$i+3]}" "${files[$i+3]}".out &
    wait   # block until all four background jobs have finished
done
```
This puts all files into an array named `files`. It then executes and backgrounds four python processes on the first four files; as soon as all four of those processes are complete, it executes the next four. It's not as efficient as always keeping a queue of 4 going, but if all processes take around the same amount of time it should be pretty close to that.
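That "queue of 4" can be kept full with bash's `wait -n` (bash 4.3 or later). A sketch under that assumption, with dummy `.fq` files and a `touch`-based stand-in for the real `python fastq_groom.py` call:

```shell
#!/bin/bash
# Work in a throwaway directory with dummy input files.
dir=$(mktemp -d)
cd "$dir"
touch a.fq b.fq c.fq d.fq e.fq f.fq

max_jobs=4
for f in *.fq; do
    # Stand-in for: python fastq_groom.py "$f" "$f.out" &
    ( sleep 0.1; touch "$f.out" ) &
    # Once four jobs are running, block until any one exits, so a new
    # job starts as soon as a slot frees up.
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
        wait -n
    done
done
wait   # wait for the last few jobs
```

Unlike the batch-of-four loop, this never leaves slots idle while one slow file holds up its batch.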
Also, please, please, please don't use the output of `ls` like that. Just use standard globbing, as in `for files in *.txt; do ...; done`.