Why are all newlines gone after PowerShell's Get-Content, Regex, and Set-Content?
For the $replacement variable, you don't really need to specify the type [string]; PowerShell will infer that from the assignment.
For the $template variable, [string] is actually wrong. By default, Get-Content will give you an array of strings (i.e. lines) instead of one string.
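But in fact you don't even want to split the input into lines in the first place. When an array of lines is converted to a single string, which is what the [string] type constraint on $template forces, PowerShell joins the elements with spaces (the default value of $OFS), so the newlines are gone before Set-Content or Out-File ever see the data.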
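A minimal sketch of the array-to-string conversion, using a hypothetical three-element array in place of real file contents. It assumes $OFS has not been changed from its default:

```powershell
# Hypothetical stand-in for what Get-Content returns without -Raw: one string per line
$lines = @('#!/bin/sh', 'echo one', 'echo two')

# The [string] type constraint forces the array into a single string
[string] $joined = $lines

# The elements end up separated by spaces, not newlines
$joined                  # → #!/bin/sh echo one echo two
$joined.Contains("`n")   # → False
```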
Using -Raw makes Get-Content return the entire file as one string. This way the line endings (like LF for Linux files) also stay the way they are.
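To compare the two reading modes side by side, here is a small sketch. The temp file name gc-demo.sh and its contents are made up for illustration:

```powershell
# Hypothetical demo file with LF-only line endings, written without a BOM
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'gc-demo.sh'
[System.IO.File]::WriteAllText($path, "line1`nline2`nline3`n")

$asLines = Get-Content -Path $path        # array of strings, line endings stripped
$asText  = Get-Content -Path $path -Raw   # one string, LF characters kept

$asLines.Count           # → 3
$asText.Length           # → 18
$asText.Contains("`n")   # → True

Remove-Item -Path $path  # clean up the demo file
```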
$replacement = "Foo Bar"
$template = Get-Content -Path "$pwd\template.sh" -Encoding UTF8 -Raw
$template = $template -replace '<REPLACE_ME>', $replacement
Set-Content -Path "$pwd\script.sh" -Value $template -Encoding UTF8
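Windows PowerShell (version 5.1 and earlier) writes a BOM whenever you save with -Encoding UTF8; PowerShell 6 and later write UTF-8 without a BOM by default. If you are on Windows PowerShell and don't want the BOM, you must use a different utility to write the file: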
$UTF8_NO_BOM = New-Object System.Text.UTF8Encoding $False
$replacement = "Foo Bar"
$template = Get-Content -Path "$pwd\template.sh" -Encoding UTF8 -Raw
$template = $template -replace '<REPLACE_ME>', $replacement
[System.IO.File]::WriteAllText("$pwd\script.sh", $template, $UTF8_NO_BOM)
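As a quick check of what the $False argument changes, this sketch compares the preamble (the BOM bytes) of the two encoder variants. The variable names are illustrative only:

```powershell
# The constructor argument controls whether the encoder emits a byte order mark
$withBom = New-Object System.Text.UTF8Encoding $True
$noBom   = New-Object System.Text.UTF8Encoding $False

# GetPreamble() returns the bytes written at the start of the file
$withBom.GetPreamble().Length   # → 3  (EF BB BF)
$noBom.GetPreamble().Length     # → 0
```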
Notes:
- PowerShell operators (like -replace) silently operate on arrays. $x -replace "search", "replacement" will perform a replace operation on every member of $x, be that a single string or an array of them.
- Recommended reading: PowerShell Set-Content and Out-File what is the difference?
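The array behaviour of -replace can be seen in a short sketch. The input strings are hypothetical, and note that -replace treats its first operand as a regular expression:

```powershell
# Single string in, single string out
$single = 'echo <REPLACE_ME>' -replace '<REPLACE_ME>', 'Foo Bar'
$single        # → echo Foo Bar

# Array in, array out: the replacement is applied to every element
$x = @('echo <REPLACE_ME>', 'exit 0')
$many = $x -replace '<REPLACE_ME>', 'Foo Bar'
$many.Count    # → 2
$many[0]       # → echo Foo Bar
$many[1]       # → exit 0
```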
I think you need to use the -Raw switch with Get-Content in order to load the file as a single string:
[String] $replacement = "Foo Bar"
[String] $template = Get-Content -Path "$pwd\template.sh" -Encoding UTF8 -Raw
$template = $template -replace '<REPLACE_ME>', $replacement
To stop the Windows line ending being added to the end of the script, I think you need to use this .NET method for writing the file:
[io.file]::WriteAllText("$pwd\script.sh", $template)
By default PowerShell attempts to convert your input into an array of strings, one for each line in the file. I think because of the Unix line endings it's not doing this successfully but is subsequently removing the newline characters.
In PowerShell 3.0 we now have a new dynamic parameter, Raw. When specified, Get-Content ignores newline characters and returns the entire contents of a file in one string. Raw is a dynamic parameter; it is available only in file system drives.
Use the -Delimiter "`n" option instead of -Raw. The -Raw option reads and returns the entire content as a single string. Although it preserves the newline characters, it is of little use if you need to manipulate the content line by line, e.g. skip the header/first row or skip blank lines.
Get-Content - background info:
By default, the Get-Content cmdlet reads and returns content line by line. This means that if you pipe it straight into Set-Content or Add-Content, each line is written to the output file as it is read, followed by a newline (the platform default, which is not necessarily the original line ending), e.g.:
Get-Content $inputFile | Set-Content $outputFilePath
However, if you store the entire content in a $variable, you receive an array of strings with the line endings stripped, which means the original newline characters are lost. When reading the file with Get-Content, you can use the -Delimiter option to specify the newline character, e.g.:
Get-Content -Delimiter "`n" $fileToRead
HTH.