What's a good tool to screen-scrape with JavaScript support? [closed]


You could use Selenium or Watir to drive a real browser.

There are also some JavaScript-based headless browsers:

  • PhantomJS is a headless WebKit browser.
    • pjscrape is a scraping framework based on PhantomJS and jQuery.
    • CasperJS is a navigation scripting and testing utility based on PhantomJS, for when you need to do a little more than point at URLs to be scraped.
  • Zombie for Node.js

Personally, I'm most familiar with Selenium, which supports writing automation scripts in a good number of languages and has more mature tooling. One example is the excellent Selenium IDE extension for Firefox, which can be used to write and run test cases and can export test scripts to many languages.
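To give a rough idea of what driving a real browser looks like, here is a minimal sketch using Selenium's Python bindings. It assumes the `selenium` package, Firefox, and geckodriver are installed; the helper name `rendered_title_and_html` and the example URL are just placeholders for illustration, not part of Selenium.

```python
def rendered_title_and_html(driver, url):
    """Load `url` in the given WebDriver and return (title, html).

    The page source is read after the browser has loaded the page,
    so it should reflect changes made by the page's JavaScript.
    """
    driver.get(url)
    return driver.title, driver.page_source


def main():
    # Imported here so the helper above can be used (and tested)
    # without Selenium installed. Assumes Firefox + geckodriver.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")  # run Firefox without a window
    driver = webdriver.Firefox(options=options)
    try:
        # Placeholder URL; replace with the page you want to scrape.
        title, html = rendered_title_and_html(driver, "https://example.com/")
        print(title, len(html))
    finally:
        driver.quit()  # always shut the browser down
```

Calling `main()` starts the browser, loads the page, and prints the title and the size of the rendered HTML. Pages that load content asynchronously may still need an explicit wait before the source is read.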


Using HtmlUnit is also a possibility.

HtmlUnit is a "GUI-Less browser for Java programs". It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, and so on, just as you do in your "normal" browser.

It has fairly good JavaScript support (which is constantly improving) and is able to work even with quite complex AJAX libraries, simulating either Firefox or Internet Explorer depending on the configuration you want to use.

It is typically used for testing purposes or to retrieve information from web sites.


Selenium now wraps HtmlUnit, so you don't need to start a browser anymore. The new WebDriver API is very easy to use too. The first example uses the HtmlUnit driver.