How to number a file based on the string in the first column in Unix?


The following awk command may help:

awk '!a[$1]++{count++}  {print count,$0}'   Input_file

Output will be as follows:

1 rs665   XP_011539469.1
1 rs665   XP_016856394.1
2 rs980   NP_001284363.1
2 rs980   XP_016856698.1
3 rs1115  NP_001191785.1
4 rs1250  NP_067652.1

Solution 2: If Input_file is already sorted (or at least grouped) by the first column, there is no need to build an array as in the solution above; comparing each line with the previous one is enough:

awk 'NR==1 || $1!=prev{count++}  {print count,$0; prev=$1}'   Input_file


If you have no assurance that the symbols are grouped together in consecutive runs, you had better use something like:

awk 'function intern(sym) {
       if (sym in table)
         return table[sym]
       return table[sym] = ++counter
     }
     { print intern($1), $1, $2 }'

This will work even if the input happens to be:

rs665   XP_011539469.1
rs980   NP_001284363.1
rs665   XP_016856394.1
rs980   XP_016856698.1
rs1115  NP_001191785.1
rs1250  NP_067652.1

Both cases of rs665 map to 1 and both rs980 cases map to 2.

This requires memory to hold the table of known symbols.
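To make the difference concrete, here is a small sketch that runs both approaches on the ungrouped sample above. The function names number_adjacent and number_interned are made up for this illustration, and both read standard input rather than a named file.

```shell
# Hypothetical helper names, used only for this comparison.

# Compares each key with the previous line only, so it assumes grouped input.
number_adjacent() {
  awk 'NR == 1 || $1 != prev { count++ } { print count, $1, $2; prev = $1 }'
}

# Remembers every key seen so far, so the order of the lines does not matter.
number_interned() {
  awk 'function intern(sym) {
         if (sym in table)
           return table[sym]
         return table[sym] = ++counter
       }
       { print intern($1), $1, $2 }'
}

sample='rs665   XP_011539469.1
rs980   NP_001284363.1
rs665   XP_016856394.1
rs980   XP_016856698.1
rs1115  NP_001191785.1
rs1250  NP_067652.1'

printf '%s\n' "$sample" | number_adjacent
printf '%s\n' "$sample" | number_interned
```

On this sample the first function numbers the lines 1 through 6, because no two adjacent keys are equal, while the second prints 1, 2, 1, 2, 3, 4.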


awk 'function intern(sym) {
       if (sym in table && $3 ~ /x/)
         return table[sym]
       return table[sym] = ++counter
     }
     { print intern($2"\t"$3"\t"$4"\t"$5"\t"$6), $0 }
     function intern2(sym) {
       if (sym in table && $3 ~ /y/)
         return table[sym]
       return table[sym] = ++counter
     }
     { print intern2($3"\t"$4), $0 }' "input.tab" > "output.tab"

Based on this answer, I'm trying to do something similar. In this case I would like to add a number in the first column of each row, where the numbering depends on the string in one of the columns. For example, if the third column is "x", the number should be assigned according to one set of columns, and if it is "y", according to a different set of columns. Would it be possible to do this by adapting the previous script? My attempt above uses conditions and it runs, but the numbering is not correct. Thanks in advance, and thanks for the previous answer, @Kaz.
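One thing that stands out in the attempt: both print rules run for every row, so each row is printed twice, and whenever the `$3` test fails the function falls through to `++counter` and hands out a new number anyway. A possible way around this, as a sketch only, is to keep a single intern function and pick the key per row before calling it. This assumes tab-separated input, that the third column is exactly "x" or "y", that both kinds of row share one counter, and that rows with any other value get an empty number; none of that is confirmed by the question. The name number_by_type is hypothetical.

```shell
# Hypothetical sketch: number rows by a key whose columns depend on column 3.
# Assumes tab-separated input and one counter shared by "x" and "y" rows.
number_by_type() {
  awk -F'\t' -v OFS='\t' '
    function intern(sym) {
      if (sym in table)
        return table[sym]
      return table[sym] = ++counter
    }
    # "x" rows: key built from columns 2 to 6, as in the attempt above
    $3 == "x" { print intern($2 SUBSEP $3 SUBSEP $4 SUBSEP $5 SUBSEP $6), $0; next }
    # "y" rows: key built from columns 3 and 4
    $3 == "y" { print intern($3 SUBSEP $4), $0; next }
    # any other value in column 3: leave the number empty
    { print "", $0 }
  '
}
```

It could then be called as `number_by_type < input.tab > output.tab`. If "x" and "y" rows should be numbered independently, two separate tables and counters would be needed instead.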