Pipeline For Downloading and Processing Files In Unix/Linux Environment With Perl


Why do this with Perl? Use bash instead. Below is just a sample.

#!/bin/bash

for file in foo1 foo2 foo3
do
    wget http://samedomain.com/$file.gz
    if [ -f "$file.gz" ]
    then
        ./myscript.sh "$file.gz" >> output.txt
    fi
done


Try combining the commands using &&, so that the second one runs only after the first completes successfully.

system("(nohup wget $file  && ./myscript.sh $file >> output.txt) &");


If you want parallel processing, you can do it yourself with forking, or use a built-in module to handle it for you. Try Parallel::ForkManager. You can see a bit more on its usage in How can I manage a fork pool in Perl?, but the CPAN page for the module will have the really useful info. You probably want something like this:

use strict;
use warnings;
use Parallel::ForkManager;

my $MAX_PROCESSES = 8; # 8 parallel processes max
my $pm = Parallel::ForkManager->new($MAX_PROCESSES);
my @files = ("foo1.gz", "foo2.gz", "foo3.gz"); # file names taken from the bash sample above

foreach my $file (@files) {
    # Forks and returns the pid for the child:
    my $pid = $pm->start and next;
    my $downurls = "http://somedomain.com/" . $file;
    system("wget $downurls");
    system("./myscript.sh $file >> output.txt");
    $pm->finish; # Terminates the child process
}
$pm->wait_all_children; # Wait for every child before reporting
print "All done!\n";
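If you want a child to skip processing when its download fails, you can check the return value of system(), which is 0 on success. A small sketch using the same variables as above:

    if (system("wget $downurls") == 0) {
        system("./myscript.sh $file >> output.txt");
    } else {
        warn "Download of $downurls failed\n";
    }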