Use Powershell and Regex to extract block of lines from a text file


Wiktor Stribiżew provided the crucial pointer in a comment on the question[1]: You must use Get-Content -Raw to read the file contents into a single string so your regex can match across lines:

    if ((Get-Content -Raw C:\Config.txt) -match '(?ms)^999.*?(?=\r?\n\S|\Z)') {
      $Matches[0]  # automatic variable $Matches reflects what was captured
    }
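To see why -Raw matters, here is a minimal sketch that compares the two ways of reading a file. The temporary file and its three-line content are made up for illustration; they are not the asker's Config.txt. Without -Raw, Get-Content returns an array of lines, and -match then acts as a filter that tests each line separately, so the regex never sees more than one line at a time.

```powershell
# Hypothetical sample file: one 999 block with two indented detail lines.
$path = [System.IO.Path]::GetTempFileName()
Set-Content -Path $path -Value @('999 header', '    detail 1', '    detail 2')

$pattern = '(?ms)^999.*?(?=\r?\n\S|\Z)'

# Without -Raw: an array of lines; -match filters the array line by line.
$lines = Get-Content $path
$lineMatches = @($lines -match $pattern)   # only the '999 header' line passes

# With -Raw: one multi-line string; the regex can match across lines.
$text = Get-Content -Raw $path
$block = if ($text -match $pattern) { $Matches[0] }

"Lines matched without -Raw: $($lineMatches.Count)"
"Lines in block with -Raw:   $(($block -split '\r?\n').Count)"

Remove-Item $path
```

With an array on the left-hand side, -match returns the matching elements instead of a Boolean, which is why the first approach silently yields just the header line.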

The regex needed some tweaking, too, including the use of the non-greedy quantifier .*?, as suggested by TheMadTechnician:

  • (?ms) sets regex options m (treats ^ and $ as line anchors) and s (makes . match \n (newlines) too).

  • ^999.*? matches 999 at the start of a line, followed by as few characters as possible (non-greedily); thanks to option s, these characters may span multiple lines.

  • (?=\r?\n\S|\Z) is a positive look-ahead assertion ((?=...)) that matches either a newline (\r?\n) followed by a non-whitespace character (\S) - assumed to be the start of the next block - or (|) the end of the input (\Z, which also matches just before a final trailing newline) - in effect, this matches either the end of the file or the start of the next block, but without including it in the match recorded in $Matches.
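The two branches of the look-ahead can be tried out on an in-memory string. This is only a sketch: the sample text is invented, and the second pattern swaps 999 for 1000 purely to show the \Z branch matching the last block.

```powershell
# Hypothetical input: three blocks whose continuation lines are indented.
$text = @(
    '998 first'
    '    a'
    '999 second'
    '    b'
    '    c'
    '1000 third'
    '    d'
) -join "`n"

# Middle block: the match stops before the newline that precedes '1000'
# (the \r?\n\S branch of the look-ahead).
$middle = if ($text -match '(?ms)^999.*?(?=\r?\n\S|\Z)') { $Matches[0] }

# Last block: no further block follows, so the \Z branch ends the match.
$last = if ($text -match '(?ms)^1000.*?(?=\r?\n\S|\Z)') { $Matches[0] }

$middle
'---'
$last
```

Note that ^999 would also match a line starting with, say, 9990; if block IDs can share a prefix, appending \b or a literal space to the prefix is one way to tighten the match.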


[1] Wiktor also suggests regex (?m)^999.*(?:\r?\n.*){2}, which works well with the sample input, but is limited to blocks that have exactly 3 lines - by contrast, the solution presented here finds blocks of any length, as long as the non-initial block lines all have leading whitespace.
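The difference between the two regexes shows up as soon as a block has more than 3 lines. The following sketch uses a made-up 4-line block and calls [regex]::Match directly instead of -match, only so that both results can be kept side by side:

```powershell
# Hypothetical input: the 999 block has four lines, not three.
$text = @(
    '999 long block'
    '    one'
    '    two'
    '    three'
    '1000 next block'
    '    four'
) -join "`n"

# Fixed-length variant: the first line plus exactly two more lines.
$fixed = [regex]::Match($text, '(?m)^999.*(?:\r?\n.*){2}').Value

# Look-ahead variant: everything up to the next non-indented line.
$flexible = [regex]::Match($text, '(?ms)^999.*?(?=\r?\n\S|\Z)').Value

($fixed -split '\r?\n').Count      # 3
($flexible -split '\r?\n').Count   # 4
```

Conversely, on a block with only 2 lines the fixed-length regex would run into the first line of the next block, whereas the look-ahead version still stops at the block boundary.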