How to parse multiline with awk without weird duplicates


Could you please try the following?

awk '
BEGIN{
  OFS="\t"
}
/^Name/{
  if(value){
    print value
  }
  sub(/.*: /,"")
  value=$0
  next
}
/^Description/{
  sub(/.*: /,"")
  value=(value?value OFS:"")$0
}
END{
  if(value){
    print value
  }
}
'  Input_file
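One caveat worth checking, sketched here under assumptions: `.*` in `sub(/.*: /,"")` is greedy, so if a Description itself contains a colon followed by a space, everything up to the last `: ` on the line is removed. The package `foo` and its description below are invented for illustration, and the anchored pattern is a variant, not part of the script above.

```shell
# Hypothetical input: the package "foo" and its fields are made up for this sketch.
input='Name            : foo
Version         : 1.0
Description     : Tool: converts things

'

# Greedy pattern from the script above: strips through the LAST ": " on the line.
greedy=$(printf '%s' "$input" | awk '/^Description/{ sub(/.*: /,""); print }')

# Anchored variant: strips only through the FIRST ": " on the line.
anchored=$(printf '%s' "$input" | awk '/^Description/{ sub(/^[^:]*: /,""); print }')

printf 'greedy:   %s\n' "$greedy"
printf 'anchored: %s\n' "$anchored"
```

If none of your descriptions contain `: `, both patterns behave the same.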


You were close:

$ awk -F': ' '/^Name/ {n=$2} /^Desc/ {print n "\t" $2}' file
zvbi    VBI capture and decoding library
zziplib A lightweight library that offers the ability to easily extract data from files archived in a single zip file

The main problem with your script was that the {print...} block was being executed for every line of input rather than only when the Description line was seen. In addition, you weren't including the blank after the : in your FS, so it was still present at the start of each field.
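The two problems described above can be reproduced on a small sample. This is only a sketch: the input mimics the `pacman -Qi` layout with placeholder field values, and the "unguarded" command is an assumed reconstruction of the failure mode, not the asker's exact script.

```shell
# Hypothetical sample in `pacman -Qi` layout; the Version values are placeholders.
input='Name            : zvbi
Version         : 1.0
Description     : VBI capture and decoding library

Name            : zstd
Version         : 1.0
Description     : Zstandard - Fast real-time compression algorithm

'

# Problem 1: a print block with no pattern runs for every input line,
# so each name is repeated once per line of its record.
unguarded=$(printf '%s' "$input" | awk -F': ' '/^Name/ {n=$2} {print n}')

# Guarding the print with /^Desc/ gives one output line per package.
guarded=$(printf '%s' "$input" | awk -F': ' '/^Name/ {n=$2} /^Desc/ {print n "\t" $2}')

# Problem 2: with FS=":" the blank after the colon stays at the start of $2.
with_blank=$(printf '%s' "$input" | awk -F':' '/^Name/ {print "[" $2 "]"; exit}')

printf '%s\n' "$unguarded"
printf '%s\n' "$guarded"
printf '%s\n' "$with_blank"
```

The brackets around `$2` in the last command are there only to make the leading blank visible.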


And I've written this sed script:

pacman -Qi | sed -n '/^\(Name\|Description\)[[:space:]]*: /{s///;H}; /^$/ba; $ba; d; :a;x;s/^\n//;s/\n/\t/;p'
  1. /^\(Name\|Description\)[[:space:]]*: /{s///;H} - Each line that starts with Name or Description has the matched part removed and is appended to the hold space.
  2. /^$/ba; $ba; d; - If an empty line or the last line of input is encountered, we branch to label a; otherwise we delete the pattern space and start a new cycle.
  3. :a;x;s/^\n//;s/\n/\t/;p - At label a we exchange the hold space with the pattern space, remove the leading newline (H always puts a newline before the text it appends, even when the hold space is empty), substitute the first remaining newline with a tab, and print the result.
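The steps above can be tried without pacman installed. The sketch below assumes GNU sed (the script relies on `\|`, `\t` and `;` after a label) and feeds it a hand-written sample in the `pacman -Qi` layout. The first command uses `l` to show the raw pattern space after a single `H`, which makes the leading newline visible.

```shell
# H on an empty hold space still inserts a newline before the appended text;
# `l` prints the pattern space with the newline shown as \n and the end as $.
hold_demo=$(printf 'a\n' | sed -n 'H;x;l')
printf '%s\n' "$hold_demo"

# Hypothetical sample in `pacman -Qi` layout; the Version values are placeholders.
input='Name            : zvbi
Version         : 1.0
Description     : VBI capture and decoding library

Name            : zstd
Version         : 1.0
Description     : Zstandard - Fast real-time compression algorithm

'

# Same sed script as above, reading the sample instead of `pacman -Qi`.
out=$(printf '%s' "$input" |
  sed -n '/^\(Name\|Description\)[[:space:]]*: /{s///;H}; /^$/ba; $ba; d; :a;x;s/^\n//;s/\n/\t/;p')
printf '%s\n' "$out"
```

Each record is flushed when its trailing blank line is read, which is why the sample ends every package with an empty line, as the real listing does.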

Sample output:

zlib    Compression library implementing the deflate compression method found in gzip and PKZIP
zsh     A very advanced and programmable command interpreter (shell) for UNIX
zstd    Zstandard - Fast real-time compression algorithm
zvbi    VBI capture and decoding library
zziplib A lightweight library that offers the ability to easily extract data from files archived in a single zip file