Parsing local HTML file using New-Object -ComObject "HTMLFile" broken?


This seems to work properly if you pass a UCS-2 (UTF-16LE) byte array instead of a string:

$html = New-Object -ComObject "HTMLFile"
$src = Get-Content -Path "./passwordreminder.html" -Raw
$src = [System.Text.Encoding]::Unicode.GetBytes($src)

try {
    # This works in PowerShell 4
    $html.IHTMLDocument2_write($src)
}
catch {
    # This works in PowerShell 5
    $html.write($src)
}
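The same pattern can be wrapped in a reusable function. This is only a minimal sketch, assuming a Windows host where the HTMLFile COM class is registered; the function name ConvertTo-HtmlDocument is made up for illustration, and which of the two write methods is available depends on the machine.

```powershell
# Hypothetical helper (not part of the original answer): parse an HTML
# string into an HTMLFile COM document, trying both write methods.
function ConvertTo-HtmlDocument {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Html
    )

    $doc = New-Object -ComObject 'HTMLFile'

    # Encode the markup as UTF-16LE bytes, as in the snippet above
    $bytes = [System.Text.Encoding]::Unicode.GetBytes($Html)

    try {
        # Interface-qualified method name
        $doc.IHTMLDocument2_write($bytes)
    }
    catch {
        # Fall back to the plain method name
        $doc.write($bytes)
    }

    return $doc
}
```

For example, `$doc = ConvertTo-HtmlDocument -Html (Get-Content -Path './passwordreminder.html' -Raw)` followed by `$doc.body.innerText` should print the visible text of the page, although member names on the returned object may also vary between machines.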


You could try with an Internet Explorer COM object:

$ie = New-Object -COM 'InternetExplorer.Application'
$ie.Navigate("file://$($PWD.Path)/passwordreminder.html")
do {
    Start-Sleep -Milliseconds 100
} until ($ie.ReadyState -eq 4)
# do stuff
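The wait loop above never gives up if the page fails to load. A minimal sketch of a bounded wait follows; the function name Wait-IEReady and the 10-second default are assumptions for illustration, not part of the original answer. It polls the same ReadyState property (4 means the document has finished loading) along with the Busy property.

```powershell
# Hypothetical helper: poll a browser object until it reports
# ReadyState 4 (complete) and is no longer busy, or until the timeout elapses.
function Wait-IEReady {
    param(
        [Parameter(Mandatory = $true)]
        $Browser,

        [int]$TimeoutSeconds = 10,     # assumed default
        [int]$PollMilliseconds = 100
    )

    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)

    while ($Browser.Busy -or $Browser.ReadyState -ne 4) {
        if ((Get-Date) -ge $deadline) {
            # Page did not finish loading in time
            return $false
        }
        Start-Sleep -Milliseconds $PollMilliseconds
    }

    return $true
}
```

After calling `$ie.Navigate(...)`, something like `if (Wait-IEReady -Browser $ie) { $ie.Document.title }` would then read from the page only once it has loaded.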

I don't have PowerShell v5, though, so I can't test. If HTMLFile is broken, this might be as well.

You can call the Navigate() method (and the loop waiting for it to complete loading the page) in an outer loop if you need to run it repeatedly.

$ie = New-Object -COM 'InternetExplorer.Application'
foreach (...) {
    $ie.Navigate("file://$($PWD.Path)/passwordreminder.html")
    do {
        Start-Sleep -Milliseconds 100
    } until ($ie.ReadyState -eq 4)
    # do stuff
}
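When one browser instance is reused like this, it is worth closing it afterwards so no hidden iexplore.exe process is left behind. The sketch below is one possible shape for that, assuming the goal is to read each page's title; the function name Get-LocalHtmlTitle and the choice of output fields are hypothetical.

```powershell
# Hypothetical helper: load each file in one shared IE instance, emit
# its path and title, and close the browser when finished.
function Get-LocalHtmlTitle {
    param(
        [Parameter(Mandatory = $true)]
        [string[]]$Path
    )

    $ie = New-Object -ComObject 'InternetExplorer.Application'
    try {
        foreach ($file in $Path) {
            # Resolve to a full path and convert it to a file:/// URI
            $full = (Resolve-Path -LiteralPath $file).ProviderPath
            $ie.Navigate(([System.Uri]$full).AbsoluteUri)

            # Same wait loop as in the answer above
            do {
                Start-Sleep -Milliseconds 100
            } until ($ie.ReadyState -eq 4)

            [pscustomobject]@{
                Path  = $full
                Title = $ie.Document.title
            }
        }
    }
    finally {
        # Close the browser and release the COM reference, even on error
        $ie.Quit()
        [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($ie)
    }
}
```

A call such as `Get-LocalHtmlTitle -Path (Get-ChildItem -Filter '*.html').FullName` would process every HTML file in the current directory with a single browser instance.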


This code snippet works by loading the Microsoft.mshtml assembly, which provides the .NET Framework's mshtml.HTMLDocumentClass type, using the Add-Type cmdlet with its -AssemblyName parameter.

Add-Type -AssemblyName "Microsoft.mshtml"
$html = New-Object -ComObject "HTMLFile"
$svc = Get-Service | Select-Object Name, Status | ConvertTo-Html
$svc | Out-File -FilePath .\report.html -Force
$htmlFile = Get-Content -Path .\report.html -Raw
$html.IHTMLDocument2_write($htmlFile)

The $html variable contains the "HTMLFile" object reference with all its methods and properties.
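To check which of those methods and properties are actually exposed on a given machine, the object can be inspected with Get-Member. The helper below is a small sketch; the name Find-ObjectMember is made up for illustration, and it works on any object, not only COM ones.

```powershell
# Hypothetical helper: list the members of any object (COM or .NET)
# whose names match a wildcard pattern.
function Find-ObjectMember {
    param(
        [Parameter(Mandatory = $true)]
        $InputObject,

        [string]$Pattern = '*'
    )

    # Get-Member accepts wildcards in -Name
    Get-Member -InputObject $InputObject -Name $Pattern |
        Select-Object -Property Name, MemberType
}
```

Running `Find-ObjectMember -InputObject $html -Pattern '*write*'` would show whether the object exposes write, IHTMLDocument2_write, or both.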