Get a specific column number by column name using awk


You can use the following awk script:

print_col.awk:

# Find the column number in the first line of a file
FNR==1{
    for(n=1;n<=NF;n++) {
        if($n == header) {
            next
        }
    }
}

# Print that column on all other lines
{
    print $n
}

Then use find to execute this script on every file:

find ... -exec awk -v header="foo" -f print_col.awk {} +
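
To see the script in action, here is a small self-contained sketch. The file names (a.txt, b.txt), their contents, and the -name '*.txt' filter are made up for illustration; they stand in for whatever your real find expression selects. The script is written out with a here-document so the example runs as-is in a scratch directory.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Same logic as print_col.awk above.
cat > print_col.awk <<'EOF'
# Find the column number in the first line of a file
FNR==1{
    for(n=1;n<=NF;n++) {
        if($n == header) {
            next
        }
    }
}

# Print that column on all other lines
{
    print $n
}
EOF

# Two hypothetical input files with "foo" in different positions.
printf 'foo bar\n1 2\n3 4\n' > a.txt
printf 'bar baz foo\n5 6 7\n' > b.txt

# find hands both files to one awk process; FNR==1 fires again at the
# start of each file, so the column is looked up per file.
out=$(find . -name '*.txt' -exec awk -v header="foo" -f print_col.awk {} + | sort)
printf '%s\n' "$out"
```

With this sample data the output is 1, 3 and 7, one per line. The sort is only there to make the result deterministic, since find does not guarantee the order in which it lists files.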

In comments you've asked for a version that could print multiple columns based on their header names. You may use the following script for that:

print_cols.awk:

BEGIN {
    # Parse headers into an assoc array h
    split(header, a, ",")
    for(i in a) {
        h[a[i]]=1
    }
}

# Find the column numbers in the first line of a file
FNR==1{
    split("", cols) # This will re-init cols
    for(i=1;i<=NF;i++) {
        if($i in h) {
            cols[i]=1
        }
    }
    next
}

# Print those columns on all other lines
{
    res = ""
    for(i=1;i<=NF;i++) {
        if(i in cols) {
            s = res ? OFS : ""
            res = res "" s "" $i
        }
    }
    if (res) {
        print res
    }
}

Call it like this:

find ... -exec awk -v header="foo,bar,test" -f print_cols.awk {} +
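
As a quick check of the multi-column version, the sketch below runs it on two sample files. Again, the file names, their contents, the -name '*.txt' filter and the header list "foo,test" are assumptions chosen for the example, not part of the original question.

```shell
#!/bin/sh
# Work in a scratch directory so nothing real is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Same logic as print_cols.awk above.
cat > print_cols.awk <<'EOF'
BEGIN {
    # Parse headers into an assoc array h
    split(header, a, ",")
    for(i in a) {
        h[a[i]]=1
    }
}

# Find the column numbers in the first line of a file
FNR==1{
    split("", cols) # This will re-init cols
    for(i=1;i<=NF;i++) {
        if($i in h) {
            cols[i]=1
        }
    }
    next
}

# Print those columns on all other lines
{
    res = ""
    for(i=1;i<=NF;i++) {
        if(i in cols) {
            s = res ? OFS : ""
            res = res "" s "" $i
        }
    }
    if (res) {
        print res
    }
}
EOF

# Hypothetical inputs: the wanted headers sit in different columns per file.
printf 'foo bar test\n1 2 3\n4 5 6\n' > a.txt
printf 'test x foo\n7 8 9\n' > b.txt

out=$(find . -name '*.txt' -exec awk -v header="foo,test" -f print_cols.awk {} + | sort)
printf '%s\n' "$out"
```

This prints "1 3", "4 6" and "7 9", one per line. Note that the selected columns come out in the order they appear in each file, not in the order listed in header, which is why b.txt yields "7 9" rather than "9 7".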