
Paste column to existing file in a loop


Would it be feasible to split the operation in two? One step would generate all the intermediate files, and another would generate the final output file. The idea is to avoid rereading and rewriting the final file over and over.

The changes to the script would be something like this:

    while [ $i -le $max ]
    do
        n=$(printf "%05d" $i)    # to preserve lexical order if $max > 9
        # create text from grib2
        wgrib2 -d 1.$(($i+1)) -no_header myGribFile.grb2 -text tmptxt$n.txt
        ((i++))
    done
    # make final file
    paste -d, existingfile.csv tmptxt[0-9]*.txt > tmpcsv.csv
    # overwrite old csv with new csv
    mv tmpcsv.csv existingfile.csv
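To try the two-step idea without wgrib2, here is a minimal sketch on toy data. The sample values and the temporary directory are made up for illustration, and the final rm of the intermediate files is an addition that is not part of the script above.

```shell
#!/usr/bin/env bash
# Toy stand-in for the wgrib2 output: all values below are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'a1\na2\na3\n' > existingfile.csv

# Step 1: pretend each loop iteration wrote one column to a zero-padded file.
for i in 1 2 10; do
    n=$(printf "%05d" "$i")
    printf '%s\n' "c${i}r1" "c${i}r2" "c${i}r3" > "tmptxt$n.txt"
done

# Step 2: one paste call. The glob expands in lexical order
# (00001, 00002, 00010), which is why the zero padding matters.
paste -d, existingfile.csv tmptxt[0-9]*.txt > tmpcsv.csv
mv tmpcsv.csv existingfile.csv

# Clean up the intermediate files (an addition, not in the script above).
rm -f tmptxt[0-9]*.txt

cat existingfile.csv
```

With this toy input, the final file should contain three rows of the form a1,c1r1,c2r1,c10r1, with the new columns in numeric order.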


The following assumes that the number of lines output by the program is constant and equal to the number of lines in existingfile.csv (which should be the case, since you are using paste).

Disclaimer: I'm not exactly sure if this would speed things up (it depends on whether the I/O redirection >> writes to the file exactly once or not). Anyway, give it a try and let me know.

So the basic idea is

  1. append the output in one go after the loop is done (note the change: wgrib2 now prints to -, which is stdout)

  2. use awk to append each subsequent block of linenum rows (linenum being the number of lines originally in existingfile.csv) as a new column on the first linenum rows

    Save to tempcsv.csv (because I can't find a way to save in the same file)

  3. rename to / overwrite existingfile.csv


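    while [ $i -le $max ]; do
        # create text from grib2
        wgrib2 -d 1.$(($i+1)) -no_header myGribFile.grb2 -text -
        ((i++))
    done >> existingfile.csv
    # linenum must equal the number of lines originally in existingfile.csv
    awk -v linenum=4 '
        { k = FNR % linenum
          # first block starts each row; later blocks are appended after a comma
          array[k] = (FNR <= linenum) ? $0 : array[k] "," $0 }
        END { for (i = 1; i <= linenum; i++) print array[i % linenum] }
    ' existingfile.csv > tempcsv.csv
    mv tempcsv.csv existingfile.csv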

If this works internally the way I imagine, you should have 2 writes to existingfile.csv instead of $max writes. So hopefully this would speed things up.
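The awk step can be checked on toy data without wgrib2. The sketch below makes two assumptions: the sample values are made up, and linenum is counted with wc -l before anything is appended rather than hardcoded. It also uses a zero-based row index, which is a small variation on the modulo trick and not necessarily how you would want to write it.

```shell
#!/usr/bin/env bash
# Toy data only: the rows and the two "columns" b and c are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'a1\na2\na3\n' > existingfile.csv

# Count the rows BEFORE appending; $(( )) strips any padding from wc.
linenum=$(( $(wc -l < existingfile.csv) ))

# Stand-in for the wgrib2 loop: each iteration prints one column to stdout,
# and the whole loop is appended to the file in one redirection.
for col in b c; do
    printf '%s\n' "${col}1" "${col}2" "${col}3"
done >> existingfile.csv

# Fold every later block of linenum rows onto the first linenum rows.
awk -v linenum="$linenum" '
    { k = (FNR - 1) % linenum                       # zero-based row index
      row[k] = (FNR <= linenum) ? $0 : row[k] "," $0 }
    END { for (i = 0; i < linenum; i++) print row[i] }
' existingfile.csv > tempcsv.csv
mv tempcsv.csv existingfile.csv

cat existingfile.csv
```

On this input the result should be the rows a1,b1,c1 through a3,b3,c3. Note that awk holds all the joined rows in memory until END, so this trades repeated file rewrites for memory proportional to the final file size.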