How to save a web page into a HTML file with PowerShell or C#?


The page uses AngularJS and jQuery, which means some of its content is loaded after the document is ready. So when you send the request using Invoke-WebRequest, you receive only the original content of the page. The rest is loaded a little later by script, which Invoke-WebRequest does not run.
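For comparison, here is a minimal sketch of the plain request described above. The output file name `static.html` is only an illustration, and the `C:\Tempfiles` folder is assumed to exist already:

```powershell
# Plain HTTP request: returns only the markup the server sends,
# before any AngularJS or jQuery code has had a chance to run.
$url = "https://support.microsoft.com/en-us/help/4052574/cumulative-update-2-for-sql-server-2017"
$response = Invoke-WebRequest -Uri $url -UseBasicParsing

# Save the raw markup. The file name is hypothetical; the folder must already exist.
$response.Content | Out-File -FilePath "C:\Tempfiles\static.html" -Encoding utf8
```

Opening the saved file should show the page shell without the parts that are filled in by script.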

To solve the problem, you can automate IE to get the expected result. It is enough to wait for the page to become ready, wait a little longer so the AngularJS logic can run and download the required content, and then read the content of the document element:

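    # Drive Internet Explorer through its COM automation interface
    $ie = New-Object -ComObject "InternetExplorer.Application"
    $url = "https://support.microsoft.com/en-us/help/4052574/cumulative-update-2-for-sql-server-2017"
    $ie.Silent = $true
    $ie.Navigate($url)
    # Wait until the browser reports that navigation has finished
    while ($ie.Busy) { Start-Sleep -Milliseconds 100 }
    # Give the AngularJS code time to run and load the remaining content
    Start-Sleep 10
    # Save the rendered DOM rather than the original response
    $ie.Document.documentElement.innerHTML > "C:\Tempfiles\output.html"
    $ie.Stop()
    $ie.Quit()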
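If the fixed 10-second pause turns out to be too short or needlessly long, one possible alternative is to poll the markup until it stops changing. This is only a sketch and not part of the answer above: `Wait-ForStableContent` and its parameters are hypothetical names. Note that "unchanged for one interval" is a heuristic and does not guarantee that every request has completed.

```powershell
# Hypothetical helper: polls a script block until two consecutive reads
# return the same text, or until the timeout expires.
function Wait-ForStableContent {
    param(
        [Parameter(Mandatory)]
        [scriptblock]$GetContent,   # returns the current page markup
        [int]$IntervalMs = 500,     # pause between reads
        [int]$TimeoutSec = 30       # give up after this many seconds
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSec)
    $previous = & $GetContent
    while ((Get-Date) -lt $deadline) {
        Start-Sleep -Milliseconds $IntervalMs
        $current = & $GetContent
        # -ceq compares case-sensitively, so a change in letter case counts as a change
        if ($current -ceq $previous) { return $current }
        $previous = $current
    }
    # Timed out: return the last snapshot that was read
    return $previous
}
```

With the `$ie` object from the script above, it could replace the `Start-Sleep 10` line like this:

```powershell
$html = Wait-ForStableContent -GetContent { $ie.Document.documentElement.innerHTML }
$html > "C:\Tempfiles\output.html"
```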


You can use the Selenium.WebDriver and Selenium.Chrome.WebDriver NuGet packages to download and save the HTML content:

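    using System.IO;
    using OpenQA.Selenium.Chrome;

    var service = ChromeDriverService.CreateDefaultService();
    service.HideCommandPromptWindow = true;   // do not show the chromedriver console window
    var options = new ChromeOptions();
    options.AddArgument("headless");          // run Chrome without a visible window
    using (var driver = new ChromeDriver(service, options))
    {
        // Setting Url navigates to the page
        driver.Url = "https://support.microsoft.com/en-us/help/4052574/cumulative-update-2-for-sql-server-2017";
        // PageSource returns the markup as the browser currently has it
        File.WriteAllText("cu2_ps.html", driver.PageSource);
    }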

But that means you need to have Chrome installed. You could use the IE driver as well, but in that case it is probably better to use IE COM automation, as suggested in another answer.