Get a specific column number by column name using awk
You can use the following awk script:
print_col.awk:
    # Find the column number in the first line of a file
    FNR==1{
        for(n=1;n<=NF;n++) {
            if($n == header) {
                next
            }
        }
    }
    # Print that column on all other lines
    {
        print $n
    }
Then use find to execute this script on every file:
    find ... -exec awk -v header="foo" -f print_col.awk {} +
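If you want to try this without touching real data, here is a minimal sketch. The file names, sample contents, and temporary directory are made up for illustration, and it assumes `mktemp -d` is available (it is on most Linux and BSD systems, but POSIX does not require it). The awk program has the same logic as print_col.awk above.

```shell
# Hypothetical sample data: two files with the "foo" column in different positions
tmp=$(mktemp -d)
printf 'id foo bar\n1 a x\n2 b y\n' > "$tmp/one.txt"
printf 'foo bar\nc x\nd y\n' > "$tmp/two.txt"

# Same logic as print_col.awk, written into the temporary directory
cat > "$tmp/print_col.awk" <<'EOF'
# Find the column number in the first line of a file
FNR==1{
    for(n=1;n<=NF;n++) {
        if($n == header) {
            next
        }
    }
}
# Print that column on all other lines
{
    print $n
}
EOF

# Run the script on every sample file and capture the output
out=$(find "$tmp" -name '*.txt' -exec awk -v header="foo" -f "$tmp/print_col.awk" {} +)
printf '%s\n' "$out"

rm -rf "$tmp"
```

Both files have a foo column, so this prints a and b from one.txt and c and d from two.txt. The order in which find visits the files is not guaranteed. Note that if a file has no matching header, n is left at NF+1 for the header line, so that file typically yields empty lines. You may want to add a check for that case.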
In the comments you asked for a version that can print multiple columns based on their header names. You can use the following script for that:
print_cols.awk:
    BEGIN {
        # Parse headers into an assoc array h
        split(header, a, ",")
        for(i in a) {
            h[a[i]]=1
        }
    }
    # Find the column numbers in the first line of a file
    FNR==1{
        split("", cols) # This will re-init cols
        for(i=1;i<=NF;i++) {
            if($i in h) {
                cols[i]=1
            }
        }
        next
    }
    # Print those columns on all other lines
    {
        res = ""
        for(i=1;i<=NF;i++) {
            if(i in cols) {
                s = res ? OFS : ""
                res = res "" s "" $i
            }
        }
        if (res) {
            print res
        }
    }
Call it like this:
    find ... -exec awk -v header="foo,bar,test" -f print_cols.awk {} +
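As a quick check of the multi-column version, here is a small sketch along the same lines. Again, the sample files and the temporary directory are hypothetical, and `mktemp -d` is assumed to be available. The awk program has the same logic as print_cols.awk above.

```shell
# Hypothetical sample data: same headers, different column order per file
tmp=$(mktemp -d)
printf 'id foo bar\n1 a x\n2 b y\n' > "$tmp/one.txt"
printf 'bar id foo\nx 3 c\ny 4 d\n' > "$tmp/two.txt"

# Same logic as print_cols.awk, written into the temporary directory
cat > "$tmp/print_cols.awk" <<'EOF'
BEGIN {
    # Parse headers into an assoc array h
    split(header, a, ",")
    for(i in a) {
        h[a[i]]=1
    }
}
# Find the column numbers in the first line of a file
FNR==1{
    split("", cols) # This will re-init cols
    for(i=1;i<=NF;i++) {
        if($i in h) {
            cols[i]=1
        }
    }
    next
}
# Print those columns on all other lines
{
    res = ""
    for(i=1;i<=NF;i++) {
        if(i in cols) {
            s = res ? OFS : ""
            res = res "" s "" $i
        }
    }
    if (res) {
        print res
    }
}
EOF

# Select the foo and bar columns from every sample file
out=$(find "$tmp" -name '*.txt' -exec awk -v header="foo,bar" -f "$tmp/print_cols.awk" {} +)
printf '%s\n' "$out"

rm -rf "$tmp"
```

The selected columns come out in the order they appear in each file, not in the order given in header. So one.txt yields "a x" and "b y", while two.txt yields "x c" and "y d". As before, the order in which find visits the files is not guaranteed.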