Find number, and remove adjacent characters equal to this number


Taking the question literally, this removes each number n embedded in fields 2 and 4, together with the n characters that follow it.

perl -lane 'for $i (1, 3) {@nums = $F[$i] =~ /(\d+)/g; for $num (@nums) {$F[$i] =~ s/$num.{$num}//}}; print join("\t", @F)'

The other answers remove the number and the entire run of identical characters that follows it, however long that run is.

To illustrate the difference between my answer and the others, use the following input:

6    ccg8qqqqqqqqqqqqggg    10 ccccg3qqqqqqqqqqqggggg

My version outputs this:

6    ccgqqqqggg     10      ccccgqqqqqqqqggggg

while theirs output this:

6    ccgggg    10 ccccgggggg


With perl:

perl -pe 's/\d+([^\d\s])\1*//g'


With sed:

sed 's/[0-9]\+\([a-z]\)\1*//g'

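The match finds any string of digits ([0-9]\+) followed by any lowercase letter, which is captured by \([a-z]\). The \1* then matches any further occurrences of that same character. The g (global) flag applies the replacement to every match on the line, not only the first. Note that \+ is a GNU sed extension; with a strictly POSIX sed, write [0-9][0-9]* instead. Unlike the perl version, [a-z] only handles runs of lowercase letters.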
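
As a quick check of that explanation, something like the following could be run. The helper name `strip_runs_sed` and the sample string are made up for illustration, and the pattern is spelled with `[0-9][0-9]*` so the sketch does not rely on GNU sed.

```shell
# Hypothetical helper: the same substitution as the sed command above,
# with [0-9][0-9]* standing in for [0-9]\+.
strip_runs_sed() {
  sed 's/[0-9][0-9]*\([a-z]\)\1*//g'
}

# "8qqqq" and "12zz" are each a number followed by a run of a single letter,
# so both are removed in one pass thanks to the g flag.
printf 'ccg8qqqqggg ab12zzc\n' | strip_runs_sed
# prints: ccgggg abc
```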