PHP Goutte / CURL - Complete ASPX Form


Well, I'm not familiar with Goutte, but using the w3zone/crawler package I put together a quick example that scrapes the content of that link:

Install it using:

composer require w3zone/crawler

Then use it for your case as follows:

<?php
require_once __DIR__ . '/vendor/autoload.php';

use w3zone\Crawler\{Crawler, Services\phpCurl};

$crawler = new Crawler(new phpCurl);
$link = 'https://wyobiz.wy.gov/Business/FilingSearch.aspx';

// Fetch the search page first: the hidden state fields it contains must be posted back.
$homePage = $crawler->get($link)->run();

preg_match('#<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)"\s*/>#', $homePage['body'], $viewState);
preg_match('#<input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="(.*?)"\s*/>#', $homePage['body'], $viewGen);
preg_match('#<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)"\s*/>#', $homePage['body'], $eventVal);

// Keys are single-quoted so PHP does not try to interpolate the "$" in the control names.
$postData = array(
    '__VIEWSTATE' => $viewState[1],
    '__LASTFOCUS' => '',
    '__EVENTTARGET' => '',
    '__EVENTARGUMENT' => '',
    '__VIEWSTATEGENERATOR' => $viewGen[1],
    '__EVENTVALIDATION' => $eventVal[1],
    'ctl00$MainContent$txtFilingName' => 'test',
    'ctl00$MainContent$searchOpt' => 'chkSearchStartWith',
    'ctl00$MainContent$txtFilingID' => '',
    'ctl00$MainContent$cmdSearch' => 'Search',
    '__ASYNCPOST' => 'true',
    // This key was listed twice in the original array; PHP keeps only the last value, so only that one is kept here.
    'ctl00$MainContent$myScriptManager' => 'ctl00$MainContent$UpdatePanel1|ctl00$MainContent$cmdSearch',
);

$response = $crawler->post(['url' => $link, 'data' => $postData])->dumpHeaders()->run();

// Escape the body so the markup in the response is shown as text inside the textarea.
echo "<textarea style='width: 90%; height: 200px;'>" . htmlspecialchars($response['body']) . "</textarea>";
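The three regular expressions above only match if the attributes appear in exactly that order and spacing. As a rough sketch of a less brittle way to collect the hidden fields, the snippet below uses PHP's built-in DOM extension. The function name extractHiddenFields and the sample markup are made up for illustration; they are not part of w3zone/crawler or Goutte.

```php
<?php
// Hypothetical helper (not from any library): returns every hidden <input>
// that has a name attribute as a name => value array.
function extractHiddenFields(string $html): array
{
    if ($html === '') {
        return []; // DOMDocument::loadHTML() rejects an empty string
    }

    $dom = new DOMDocument();
    // Real pages are rarely valid HTML, so keep libxml's parse warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    $xpath = new DOMXPath($dom);
    // Note: the attribute comparison is case-sensitive (type="hidden").
    foreach ($xpath->query('//input[@type="hidden"][@name]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }

    return $fields;
}

// Stand-in markup; the real page would be $homePage['body'].
$sample = '<html><body><form>'
    . '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />'
    . '<input type="hidden" value="gen42" id="__VIEWSTATEGENERATOR" name="__VIEWSTATEGENERATOR">'
    . '<input type="text" name="ctl00$MainContent$txtFilingName" value="">'
    . '</form></body></html>';

$hidden = extractHiddenFields($sample);
print_r($hidden);
```

With something like this, the hidden fields could be merged into the POST data in one go (for example `$postData = $hidden + $formValues;`, where `$formValues` is your own array of visible field values), rather than being matched one by one.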


The problem for me was that the ASP.NET asynchronous response isn't HTML - it's pipe-delimited text with HTML in it:

<html>1|#||4|6079|updatePanel|ctl00_MainContentPlaceHolder_ucLicenseLookup_UpdtPanelGridLookup|                <div class="modal-window-lookup-results fade bs-example-modal-lg in">                    <div class="modal-header">    [...]</html>  

So, when Goutte feeds it to BrowserKit, it breaks. Goutte doesn't suck - you just can't feed it non-HTML garbage.

To get around this in a hurry I just did:

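$crawler = $client->request('POST', $url, $params);
// $crawler is broken at this point because the response is not HTML!

// Work on the raw response body instead.
$html = $client->getResponse()->getContent();
// Drop everything before the first <div ...
$html = substr($html, strpos($html, "<div"));
// ... and everything from the first hiddenField record onwards
// (the -3 also trims the few characters sitting just before that marker).
$html = substr($html, 0, strpos($html, "|hiddenField|") - 3);
$html = "<!DOCTYPE html><html>$html</html>";

$crawler = new \Symfony\Component\DomCrawler\Crawler($html);
print $crawler->html();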
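If the quick substr() slicing turns out to be too fragile, the response can also be split into its records. The sketch below assumes the body is a sequence of `length|type|id|content|` records, with the length counted in characters, which is how the sample response above appears to be laid out. The function name parseAsyncPostback and the sample data are hypothetical, and it needs the mbstring extension. It expects the raw body (what `$client->getResponse()->getContent()` returns), without any `<html>` wrapper.

```php
<?php
// Hypothetical helper: splits an ASP.NET async postback body into records.
// Assumption: every record looks like "length|type|id|content|".
function parseAsyncPostback(string $body): array
{
    $records = [];
    $pos = 0;
    $total = mb_strlen($body, 'UTF-8');

    while ($pos < $total) {
        // Read the three pipe-terminated header fields: length, type, id.
        $header = [];
        for ($i = 0; $i < 3; $i++) {
            $pipe = mb_strpos($body, '|', $pos, 'UTF-8');
            if ($pipe === false) {
                return $records; // truncated body or trailing junk: stop here
            }
            $header[] = mb_substr($body, $pos, $pipe - $pos, 'UTF-8');
            $pos = $pipe + 1;
        }

        [$length, $type, $id] = $header;
        if (!preg_match('/^\d+$/', $length)) {
            return $records; // not a record in the assumed format
        }

        // The content may itself contain pipes, so rely on the length prefix.
        $content = mb_substr($body, $pos, (int) $length, 'UTF-8');
        $pos += (int) $length + 1; // skip the content and its closing pipe

        $records[] = ['type' => $type, 'id' => $id, 'content' => $content];
    }

    return $records;
}

// Made-up response in the assumed format.
$panelHtml = '<div class="results"><p>Test | Company</p></div>';
$sample = '1|#||4|'
    . strlen($panelHtml) . '|updatePanel|MainContent_UpdatePanel1|' . $panelHtml . '|'
    . '0|hiddenField|__EVENTTARGET||'
    . '6|hiddenField|__VIEWSTATE|abc123|';

// Keep only the HTML fragments, keyed by the panel id.
$panels = [];
foreach (parseAsyncPostback($sample) as $record) {
    if ($record['type'] === 'updatePanel') {
        $panels[$record['id']] = $record['content'];
    }
}

echo $panels['MainContent_UpdatePanel1'], PHP_EOL;
```

Each panel's content can then be wrapped in `<!DOCTYPE html><html>...</html>` and passed to `new \Symfony\Component\DomCrawler\Crawler(...)` exactly as in the quick fix above. The hiddenField records appear to carry the refreshed `__VIEWSTATE` and friends, which would be the values to send with a follow-up request.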