How to extract one column from multiple files, and paste those columns into one file?

Here's one way using awk and a sorted glob of files:

awk '{ a[FNR] = (a[FNR] ? a[FNR] FS : "") $5 } END { for(i=1;i<=FNR;i++) print a[i] }' $(ls -1v *)

Results:

1 8 a
2 9 b
3 10 c
4 11 d
5 12 e
6 13 f
7 14 g

Explanation:

For each line of input of each input file:
- Use the file's line number (FNR) as the array key and store column 5 as its value.
- (a[FNR] ? a[FNR] FS : "") is a ternary operation that builds up the array's value as a record. It tests whether the array already holds a non-empty value for this line number. If so, it takes that value followed by the field separator (FS, a single space by default) before appending the fifth column. Otherwise nothing is prepended, and the value is just the fifth column.
At the end of the script:
- Use a C-style loop to iterate through the array, printing each of the array's values. The loop bound is FNR, which in the END block holds the line count of the last file read.
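The accumulation described above can also be sketched in Python, which may make the role of the ternary easier to see. This is only an illustration, not a replacement for the awk one-liner: the function name paste_column and its parameters are made up for this example, and each "file" is passed in as a list of lines so the snippet runs on its own. Unlike the awk loop, it prints every line number it has seen rather than stopping at the last file's line count.

```python
def paste_column(files, column=5, sep=" "):
    """Join the given 1-based column of every file, line by line.

    `files` is an iterable of iterables of lines (hypothetical input shape,
    chosen so the example is self-contained).
    """
    rows = {}  # line number within a file (awk's FNR) -> joined record
    for lines in files:
        for fnr, line in enumerate(lines, start=1):
            fields = line.split()  # whitespace splitting, like awk's default
            value = fields[column - 1] if len(fields) >= column else ""
            # Same idea as the awk ternary: prepend the existing record and a
            # separator only if this line number already has a record.
            rows[fnr] = rows[fnr] + sep + value if fnr in rows else value
    return [rows[i] for i in sorted(rows)]


# Three small in-memory "files" whose fifth columns are 1,2 / 8,9 / a,b.
demo = [
    ["x x x x 1", "x x x x 2"],
    ["x x x x 8", "x x x x 9"],
    ["x x x x a", "x x x x b"],
]
for row in paste_column(demo):
    print(row)
```

With real files, each element of files could be an open file object, since iterating over one yields its lines.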


For only ~4000 files, you should be able to do:

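 find . -name 'sample_problem*_part*.txt' | xargs paste

The pattern is quoted so that the shell passes it to find unchanged instead of expanding it against the current directory. If find is giving names in the wrong order, pipe it to sort:

 find . -name 'sample_problem*_part*.txt' | sort ... | xargs paste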
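What counts as the "right" order matters here: a plain lexical sort puts part10 before part2. As a rough sketch of the numeric-aware ordering one usually wants for names like these, the following Python key function splits each name into text and integer chunks. The helper name natural_key and the sample file names are hypothetical, chosen only to match the pattern used above.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so 'part2' sorts before 'part10'."""
    # re.split with a capturing group keeps the digit runs in the result,
    # so text and number chunks alternate in every key.
    return [int(chunk) if chunk.isdecimal() else chunk
            for chunk in re.split(r"(\d+)", name)]


# Hypothetical file names matching sample_problem*_part*.txt
names = [
    "sample_problem1_part10.txt",
    "sample_problem1_part2.txt",
    "sample_problem1_part1.txt",
]
print(sorted(names))                   # lexical order: part1, part10, part2
print(sorted(names, key=natural_key))  # numeric-aware order: part1, part2, part10
```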


# print filenames in sorted order
find -name sample\*.txt | sort |
# extract 5-th column from each file and print it on a single line
xargs -n1 -I{} sh -c '{ cut -s -d " " -f 5 $0 | tr "\n" " "; echo; }' {} |
# transpose
python transpose.py ?

where transpose.py:

#!/usr/bin/env python
"""Write lines from stdin as columns to stdout."""
import sys
from itertools import izip_longest
missing_value = sys.argv[1] if len(sys.argv) > 1 else '-'
for row in izip_longest(*[column.split() for column in sys.stdin],
                        fillvalue=missing_value):
    print " ".join(row)

Output

1 8 a
2 9 b
3 10 c
4 11 d
5 ? e
6 ? f
? ? g

This assumes the first and second files have fewer lines than the third one (missing values are replaced by '?').
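Note that transpose.py as written uses izip_longest and the print statement, so it targets Python 2. Under Python 3 the same idea might look like the sketch below; wrapping the logic in a function named transpose is a choice made for this example, not part of the original script. The sample input has 6, 4 and 7 values per line, which is one arrangement consistent with the output shown above.

```python
from itertools import zip_longest


def transpose(lines, missing_value="-"):
    """Turn each input line into a column, padding short columns."""
    columns = [line.split() for line in lines]
    # zip_longest walks all columns in parallel and fills the gaps left by
    # shorter columns with missing_value.
    return [" ".join(row)
            for row in zip_longest(*columns, fillvalue=missing_value)]


# One line per input file, as produced by the cut/tr step of the pipeline.
sample = ["1 2 3 4 5 6", "8 9 10 11", "a b c d e f g"]
for row in transpose(sample, "?"):
    print(row)
```

To use it in the pipeline, pass sys.stdin as lines and sys.argv[1] (when given) as missing_value, then print each returned row.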
