PowerShell mangling ASCII text
I can confirm that your commands do inexplicably result in extra line breaks in the output file, at the start and at the end. PowerShell also converts the tabs in the original file into four spaces.
While I cannot explain why, this command does the same thing without those issues:
Get-Content -Path C:\Windows\System32\drivers\etc\hosts -Encoding Ascii | Where-Object { -not $_.Contains("thereisnolinelikethis") } | Out-File -FilePath "c:\temp\testfile" -Encoding Ascii
I think this is more of an issue with PowerShell's F&O (formatting & output) engine. Keep in mind that Select-String outputs a rich object called MatchInfo. When that object reaches the end of the pipeline, it needs to be rendered to a string. I think it is that rendering/formatting that injects the extra line. One of the properties on MatchInfo is the line that was matched (or not matched). If you pass just the Line property down the pipe, it seems to work better (hashes match):
Get-Content C:\Windows\system32\drivers\etc\hosts | Select-String -notmatch "thereisnolinelikethis" | Foreach {$_.Line} | Out-File -Encoding ascii c:\temp\testfile
BTW you only need to specify ASCII encoding when outputting back to the file. Everywhere else in PowerShell, just let the string flow as Unicode.
All that said, I would use Where-Object instead of Select-String for this scenario. Where-Object is a filtering command, which is what you want. Select-String takes input of one form (string) and converts it to a different object (MatchInfo).
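To make the difference between the two commands concrete, here is a minimal sketch that inspects what each one emits. The two sample lines are made up for illustration and stand in for the contents of a real file.

```powershell
# Two sample lines stand in for the contents of a text file (illustrative data).
$lines = "127.0.0.1`tlocalhost", '# a comment'

# Select-String wraps each non-matching line in a MatchInfo object.
$viaSelectString = $lines | Select-String -NotMatch 'thereisnolinelikethis'
$viaSelectString[0].GetType().FullName     # Microsoft.PowerShell.Commands.MatchInfo
$viaSelectString[0].Line -eq $lines[0]     # True: Line holds the original text

# Where-Object passes the original strings through unchanged.
$viaWhereObject = $lines | Where-Object { $_ -notmatch 'thereisnolinelikethis' }
$viaWhereObject[0].GetType().FullName      # System.String
```

Because Where-Object never leaves the string type, nothing has to be rendered by the formatting engine before the text reaches Out-File.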
Out-File adds a trailing NewLine ("`r`n") to the testfile file.
C:\Windows\System32\drivers\etc\hosts does not contain a trailing newline out of the box, which is why you get a different FileHash.
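The effect can be reproduced on a scratch file. The sketch below is only an illustration: the file name and sample lines are hypothetical, and the alternative shown (joining the lines and calling [System.IO.File]::WriteAllText) is one possible way to avoid the trailing newline, not something the original commands did.

```powershell
# Hypothetical scratch file in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'outfile-newline-demo.txt'
$lines = 'first', 'second'
$newline = [System.Environment]::NewLine

# Out-File terminates every line, including the last one.
$lines | Out-File -FilePath $path -Encoding ascii
$withOutFile = [System.IO.File]::ReadAllBytes($path).Length

# Joining the lines ourselves and writing the text directly leaves no trailing newline.
[System.IO.File]::WriteAllText($path, ($lines -join $newline), [System.Text.Encoding]::ASCII)
$withWriteAllText = [System.IO.File]::ReadAllBytes($path).Length

# The two files differ by exactly one newline sequence (2 bytes for CR+LF on Windows).
$difference = $withOutFile - $withWriteAllText
$difference -eq $newline.Length

Remove-Item -LiteralPath $path
```

If the source file has no trailing newline and the hashes must match, writing the joined text this way keeps the byte count the same as the input, provided the input also uses the platform's newline between lines.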
If you open the files with a StreamReader, you'll see that the underlying stream differs in length (due to the trailing newline in the new file):
PS C:\> $Hosts = [System.IO.StreamReader]"C:\Windows\System32\drivers\etc\hosts"
PS C:\> $Tests = [System.IO.StreamReader]"C:\temp\testfile"
PS C:\> $Hosts.BaseStream.Length
822
PS C:\> $Tests.BaseStream.Length
824
PS C:\> $Tests.BaseStream.Position = 822; $Tests.Read(); $Tests.Read()
13
10
ASCII characters 13 (0x0D) and 10 (0x0A) correspond to [System.Environment]::NewLine, or CR+LF.
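The same check can be wrapped in a small helper that looks only at the last two bytes of a file. Test-EndsWithCrLf is a hypothetical name introduced here for illustration; it is not a built-in cmdlet.

```powershell
# Hypothetical helper: returns $true when a file ends with CR (13) followed by LF (10).
function Test-EndsWithCrLf {
    param([Parameter(Mandatory)][string]$Path)

    # Resolve to a full path, because .NET methods ignore PowerShell's current location.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)

    return ($bytes.Length -ge 2 -and $bytes[-2] -eq 13 -and $bytes[-1] -eq 10)
}
```

Calling Test-EndsWithCrLf on both C:\Windows\System32\drivers\etc\hosts and C:\temp\testfile would show whether the two files differ only in that final CR+LF.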