Delete repeated text between delimiters
That's a job for awk:
awk 'seen[$0]{next}{seen[$0]=1}1' RS='%' ORS='%' fortune
RS='%' means we are using % as the record separator.
seen[$0] checks whether we have already seen this value. $0 is the whole record, the fortune's text, as a string. If we have seen the value, we move on to the next record and print nothing.
{seen[$0]=1} adds the record to the lookup table. 1 prints the current record, since it is always true. Note that this code is only executed when we have not seen the record before, because of the preceding next statement.
ORS='%' sets the output record separator to %.
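To see the one-liner at work, here is a minimal sketch that runs it against a small made-up file. The temporary directory, the file name sample and the entries "fortune A" and "fortune B" are assumptions for illustration, not part of the original question.

```shell
# Hypothetical sample data: a fortune-style file in which one entry is repeated.
tmpdir=$(mktemp -d)
printf '%s\n' '%' 'fortune A' '%' 'fortune B' '%' 'fortune A' '%' > "$tmpdir/sample"

# Same one-liner as above, run against the sample file.
deduped=$(awk 'seen[$0]{next}{seen[$0]=1}1' RS='%' ORS='%' "$tmpdir/sample")

# Count how many copies of each entry survive in the output.
count_a=$(printf '%s\n' "$deduped" | grep -c '^fortune A$')
count_b=$(printf '%s\n' "$deduped" | grep -c '^fortune B$')
echo "fortune A: $count_a, fortune B: $count_b"
rm -r "$tmpdir"
```

Because the separator is only %, each record keeps its surrounding newlines, so records are compared exactly, whitespace included. An entry that is not preceded by a % line (for example the very first one in a file that does not start with %) would therefore compare differently from a later copy of the same text.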
Awk can handle it. Set the record separator to "%\n" (a multi-character RS is honored by gawk and mawk; POSIX leaves the behavior unspecified) and then print unique entries:
awk 'BEGIN{RS="%\n"} { if (! ($0 in fortunes)) { fortunes[$0]++; print $0 "%"} }' data
%
This is sample fortune 1
%
This is sample fortune 2
%
This fortune is repeated
%
This is sample fortune 3
%
This fortuneis unique
%
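The command can be tried on a throwaway file, as in the sketch below. The input here is invented for illustration (the original data file is not shown), with one entry deliberately repeated.

```shell
# Hypothetical sample data; the repeated entry appears twice in the input.
tmpdir=$(mktemp -d)
printf '%s\n' '%' 'This is sample fortune 1' '%' \
    'This fortune is repeated' '%' 'This fortune is repeated' '%' > "$tmpdir/data"

# Same command as above. "$0 in fortunes" tests membership without
# creating the array element as a side effect of the lookup.
result=$(awk 'BEGIN{RS="%\n"} { if (! ($0 in fortunes)) { fortunes[$0]++; print $0 "%"} }' "$tmpdir/data")

# Count how many copies of the repeated entry remain in the output.
repeated=$(printf '%s\n' "$result" | grep -c '^This fortune is repeated$')
echo "copies of the repeated fortune: $repeated"
rm -r "$tmpdir"
```

With "%\n" as the separator, the records no longer carry a leading newline, so each one is just the fortune's text followed by its final newline.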