Unix - randomly select lines based on column values


First grep all the lines that end with a given number, then pick 10 of them at random using shuf -n 10. The pattern " ${i}$" assumes the number is the last space-separated field on the line.

for i in {1..10}; do
    grep " ${i}$" file | shuf -n 10
done > randomFile

If you don't have shuf, use sort -R to randomly sort them instead:

for i in {1..10}; do
    grep " ${i}$" file | sort -R | head -n 10
done > randomFile
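As a quick sanity check, the loop can be tried on a generated sample file. This is only a sketch: the temporary file names, the three-column layout, and the three groups of 25 lines are assumptions made up for illustration, and it assumes shuf is installed.

```shell
# Build a hypothetical sample: 3 columns, the last one is the group (1..3),
# with 25 lines per group.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
for g in 1 2 3; do
    for n in $(seq 1 25); do
        echo "id${g}_${n} value${n} ${g}"
    done
done > "$sample"

# Pick 10 random lines per group, same approach as the loop above.
for i in 1 2 3; do
    grep " ${i}$" "$sample" | shuf -n 10
done > "$tmpdir/randomFile"

# Count the selected lines per group value (third column).
awk '{ n[$3]++ } END { for (g in n) print g, n[g] }' "$tmpdir/randomFile" | sort
```

Each group should be reported with a count of 10, while the lines chosen within each group change from run to run.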


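If you can use awk, you can do the same with a one-liner. It shuffles the whole file once and then keeps the first 10 lines seen for each value of the third column: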

sort -R file | awk '{if (count[$3] < 10) {count[$3]++; print $0}}'
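The one-liner hard-codes column 3 and a limit of 10. A minimal sketch of a parameterized version follows. The function name sample_per_group and its arguments are hypothetical, and it uses shuf for the shuffle (swap in sort -R if shuf is unavailable).

```shell
# sample_per_group FILE COLUMN N
# Prints up to N randomly chosen lines of FILE for each distinct value
# found in COLUMN (whitespace-separated fields, 1-based).
sample_per_group() {
    # Shuffle all lines, then let awk print a line only while the counter
    # for that line's column value is still below N.
    shuf "$1" | awk -v col="$2" -v n="$3" 'count[$col]++ < n'
}
```

It would be called as: sample_per_group file 3 10 > randomFile. Unlike the grep loop, this does not need to know the set of values in advance, but groups with fewer than N lines simply return all of their lines.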