
How to extract text between two words in unix?


The command in the question (sed -n "/am/,/sed/p", note the added slash) means:

  • Find a line containing the string am
  • and print (p) until a line containing sed occurs

Therefore it prints:

I am using basic grep expression

because that line contains am. If you add more lines, they are printed too, until a line containing sed occurs (or until the end of the input, if no such line exists).

E.g.:

echo -e 'I am using basic grep expression.\nOne more line\nOne with sed\nOne without' | sed -n "/am/,/sed/p"

results in:

I am using basic grep expression.
One more line
One with sed

I think what you want is something like this:

sed -n "s/.*\(am.*sed\).*/\1/p"

Example:

echo 'I am using basic grep expression.' | sed -n "s/.*\(am.*sed\).*/\1/p"
echo 'I am using basic sed expression.' | sed -n "s/.*\(am.*sed\).*/\1/p"
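The substitution above keeps both boundary words in its output. If only the text strictly between them is wanted, the capture group can be moved inside the boundaries. A minimal sketch, assuming both words are on the same line and separated from the wanted text by single spaces; the function name between_words is made up for illustration:

```shell
# Hypothetical helper: print the text between the words "am" and "sed"
# for each input line that contains both, in that order.
between_words() {
    # \(.*\) captures only what lies between "am " and " sed";
    # -n suppresses lines on which the substitution does not match.
    sed -n 's/.*am \(.*\) sed.*/\1/p'
}

echo 'I am using basic sed expression.' | between_words
echo 'I am using basic grep expression.' | between_words
```

The first call prints "using basic"; the second prints nothing, because that line has no sed after am.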


If the two words are on different lines, you have to use a slightly different sed command, such as:

sed -n '/am/{:a; /am/x; $!N; /sed/!{$!ba;}; /sed/{s/\n/ /gp;}}' file

It is meant to print only the lines from am to sed when the two words are spread across multiple lines.
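A different and simpler route, not the command above, is to join the lines first and then reuse the single-line substitution. A rough sketch, assuming the input is small enough to treat as one line and that replacing newlines with spaces is acceptable; the function name span_lines is hypothetical:

```shell
# Hypothetical helper: join all input lines with spaces, then extract
# the text from "am" through "sed" with the single-line substitution.
span_lines() {
    # paste -s merges all lines into one, using a space as the delimiter.
    paste -s -d ' ' - | sed -n 's/.*\(am.*sed\).*/\1/p'
}

printf 'I am using\nbasic\nsed expression\n' | span_lines
```

For this input it prints "am using basic sed". Because .* is greedy, an input with several occurrences of sed would be matched up to the last one.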


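With sed this can work, but the syntax is quite overwhelming. If you need to crop part of a multi-line text, you might want to try a simpler way using GNU grep. The -P option enables Perl-style regular expressions, and -z makes grep read the whole input as one record, which is what allows the match to span newlines:

grep -zoP '(?s)(?<=START phrase).*(?=END phrase)' multi_line.txt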

For example, I find this the easiest way to grab a Perforce changelist description (without the rest of the CL info):

p4 describe {CL NUMBER} | grep -zoP '(?s).*(?=Affected files)'

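Note that the lookbehind (?<=...) and the lookahead (?=...) keep the starting and ending phrases out of the output. To include a phrase, write it as plain text in the pattern instead of wrapping it in a lookaround.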
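The effect of the lookarounds can be seen on a small sample. This is only a sketch, assuming GNU grep built with PCRE support (-P); the sample text and the variable names are made up. It uses -z so that grep reads the whole input as one record and the match can cross newlines:

```shell
# Made-up sample input; START phrase and END phrase are placeholder markers.
sample='header
START phrase
line one
line two
END phrase
footer'

# Lookarounds keep the markers out of the output.
# tr strips the NUL byte that -z appends to each match.
inner=$(printf '%s\n' "$sample" |
    grep -zoP '(?s)(?<=START phrase\n).*(?=\nEND phrase)' | tr -d '\0')

# Plain text instead of lookarounds keeps the markers in the output.
outer=$(printf '%s\n' "$sample" |
    grep -zoP '(?s)START phrase.*END phrase' | tr -d '\0')

printf '%s\n' "$inner"
printf '%s\n' "$outer"
```

Here inner holds the two lines "line one" and "line two", while outer also includes the START phrase and END phrase lines.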