
AWK Script - What does this script do?


It looks like it's trimming all trailing pipe characters on the first and last lines only.


Wow, whoever wrote this must have been paid by the line.

The block of code that occurs twice, from len = length(line) to line = substr(line,1,newlen-1), performs a string transformation that could be expressed more simply (and more clearly) as a regular expression replacement. It counts the | characters at the end of line and strips them. When the line ends with a character other than |, exactly one character is stripped (this may be accidental). The whole block could be replaced by gsub(/(\|+|.)$/, "", line), or by gsub(/\|+$/, "", line) if the behavior with no final | doesn't matter.
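To see how the two replacements differ, here is a small sketch run from a POSIX shell. The helper names strip_any and strip_pipes and the sample lines are made up for illustration; only the two patterns come from the explanation above. Since the patterns are anchored at the end of the line, sub behaves the same as gsub here.

```sh
# Hypothetical helpers: each applies one of the two patterns to every input line.
strip_any() {
    # Remove a run of trailing pipes, or else the single last character.
    awk '{ sub(/(\|+|.)$/, ""); print }'
}
strip_pipes() {
    # Remove a run of trailing pipes only; other lines pass through unchanged.
    awk '{ sub(/\|+$/, ""); print }'
}

printf '%s\n' 'a|b|||' 'abc' | strip_any
printf '%s\n' 'a|b|||' 'abc' | strip_pipes
```

With strip_any the second sample line comes out as ab, which is the possibly accidental one-character loss; with strip_pipes it stays abc. Both turn a|b||| into a|b.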

As for the overall structure, there are three parts in the code: what's done for the first line (if (NR == 1) {…}), what's done for other lines (else {…}), and what's done after the last line (END {…}). On the first line, the variable line is set to the transformed $0. On subsequent lines, the saved line is printed, then line is set to the current line. Finally, the last line is printed, transformed. This print-previous-then-save-current pattern is a common trick for treating the last line differently: when you read a line, you can't know whether it's the last one, so you save it, print the previous line and move on; in the END block, where no more input is coming, you apply the special treatment to the saved last line.
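The print-previous-then-save-current pattern can be sketched in isolation. This is an illustration only, not the script from the question: the helper name mark_ends and the "first: " and "last: " labels are invented here so the special handling is easy to spot in the output.

```sh
# Hypothetical demo: label the first and last lines, pass the others through.
mark_ends() {
    awk '
        NR > 1 { print prev }                              # emit the line saved on the previous record
               { prev = (NR == 1 ? "first: " : "") $0 }    # save the current line, labelled if it is the first
        END    { if (NR > 0) print "last: " prev }         # the line still saved at the end is the last one
    '
}

printf '%s\n' one two three | mark_ends
```

Every line is printed one record late, so by the time END runs, the only line not yet printed is the last one.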

Here's how I'd write it. The data flow is similarly nontrivial (but hardly contrived either), but at least it's not drowned in a messy text transformation.

    function cleanup (line) { gsub(/(\|+|.)$/, "", line); return line }
    NR != 1 { print prev }
    { prev = (NR == 1 ? cleanup($0) : $0) }
    END { print cleanup(prev) }
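To try the script above on sample data, it can be wrapped in a shell function. The name trim_ends and the three input lines are assumptions made for this example.

```sh
# Hypothetical wrapper around the script above.
trim_ends() {
    awk '
        # Strip trailing pipes, or the last character if the line does not end in a pipe.
        function cleanup(line) { gsub(/(\|+|.)$/, "", line); return line }
        NR != 1 { print prev }                              # print the previously saved line
                { prev = (NR == 1 ? cleanup($0) : $0) }     # only the first line is cleaned when saved
        END     { print cleanup(prev) }                     # the last line is cleaned when printed
    '
}

printf '%s\n' 'h1|h2||' 'a|b||' 't1|t2|||' | trim_ends
```

The first and last lines lose their trailing pipes (h1|h2 and t1|t2), while the middle line a|b|| is printed unchanged.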


I may be wrong, but at a quick glance it seems to filter out the | character in a file.