Delete repeated text between delimiters


That's a job for awk:

awk 'seen[$0]{next}{seen[$0]=1}1' RS='%' ORS='%' fortune

RS='%' means we are using % as the record separator.

seen[$0] checks whether we have already seen this record. $0 is the whole record, i.e. the fortune's text, as a string. If the record has been seen, next skips to the next record without printing anything.

{seen[$0]=1} adds the record to the lookup table. The trailing 1 is a pattern that is always true, so it prints the current record. Because of the preceding next statement, both of these run only for records we have not seen before.

ORS='%' sets the output record separator to %.
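
To try this out, here is a minimal sketch that wraps the command in a shell function and runs it on made-up input. The function name dedupe_records and the sample data are only illustrative assumptions; they are not part of the question.

```shell
#!/bin/sh
# dedupe_records FILE: print each '%'-separated record of FILE once,
# keeping the first occurrence. (Hypothetical helper name.)
dedupe_records() {
    awk 'seen[$0]{next}{seen[$0]=1}1' RS='%' ORS='%' "$1"
}

# Made-up sample input: four records, "first" appears twice.
sample=$(mktemp)
printf 'first%%second%%first%%third%%' > "$sample"

dedupe_records "$sample"   # prints: first%second%third%
echo                       # the output has no trailing newline, so add one
rm -f "$sample"
```

A common shorter spelling of the same logic is awk '!seen[$0]++' RS='%' ORS='%' fortune, which prints a record only the first time its text is encountered.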


Awk can handle it. Set the record separator to "%\n" and then print unique entries:

awk 'BEGIN{RS="%\n"} { if (! ($0 in fortunes)) { fortunes[$0]++; print $0 "%"} }' data

Output:

%
This is sample fortune 1
%
This is sample fortune 2
%
This fortune is repeated
%
This is sample fortune 3
%
This fortune
is unique
%
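
One caveat: a multi-character RS such as "%\n" works in gawk and mawk, but POSIX leaves the behavior unspecified when RS is longer than one character. If that matters, the following is a rough sketch of a line-based variant that avoids RS altogether. It assumes every delimiter is a line containing only %, and the function name dedupe_fortunes is made up for illustration.

```shell
#!/bin/sh
# dedupe_fortunes FILE: print each fortune once, where fortunes are
# terminated by a line consisting of a single '%'. (Hypothetical helper.)
dedupe_fortunes() {
    awk '
        # A delimiter line ends the fortune collected so far.
        $0 == "%" {
            if (!(text in seen)) {
                seen[text] = 1
                printf "%s%%\n", text
            }
            text = ""
            next
        }
        # Any other line is part of the current fortune.
        { text = text $0 "\n" }
        # Print trailing text that has no closing delimiter, if it is new.
        END { if (text != "" && !(text in seen)) printf "%s", text }
    ' "$1"
}
```

Called as dedupe_fortunes data, this should give the same result as the command above for input in that format, including the leading % when the file starts with a delimiter line.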