
Prevent CSS/other resource download in PhantomJS/Selenium driven by Python


A bold young soul by the name of “watsonmw” recently added functionality to GhostDriver (which PhantomJS uses to interface with Selenium) that allows access to PhantomJS API calls that require a page object, like the onResourceRequested callback you cited.

If you need a solution at all costs, consider building PhantomJS from source (which the developers note “takes roughly 30 minutes ... with 4 parallel compile jobs on a modern machine”) and integrating his patch, linked above.

Then this (untested) Python code should work as a proof of concept:

    from selenium import webdriver

    driver = webdriver.PhantomJS('phantomjs')

    # hack while the python interface lags
    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')

    driver.execute('executePhantomScript', {'script': '''
        page.onResourceRequested = function(requestData, request) {
            // ...
        }
    ''', 'args': []})

Until then, you’ll just get a Can't find variable: page exception.

Good luck! There are a lot of great alternatives, like working in a JavaScript environment, driving Gecko, using a proxy, etc.


Will's answer got me on track. (Thanks Will!)

The current PhantomJS (1.9.8) includes GhostDriver 1.1.0, which already contains watsonmw's patch.

Download the latest PhantomJS, then create the following symlinks (sudo may be required):

    ln -s path/to/bin/phantomjs  /usr/local/share/phantomjs
    ln -s path/to/bin/phantomjs  /usr/local/bin/phantomjs
    ln -s path/to/bin/phantomjs  /usr/bin/phantomjs
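Once the symlinks are in place, it may be worth confirming which PhantomJS actually gets picked up from PATH. The following is a minimal sketch: `parse_version` and `phantomjs_version` are hypothetical helper names, and treating anything at or above 1.9.8 as new enough is my assumption based on the version mentioned above.

```python
import re
import shutil
import subprocess


def parse_version(text):
    """Turn a string such as '1.9.8' into a comparable tuple (1, 9, 8).

    Returns None when no 'X.Y.Z' number can be found in the text.
    """
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def phantomjs_version(binary="phantomjs"):
    """Return the version tuple of the binary found on PATH, or None if absent."""
    path = shutil.which(binary)
    if path is None:
        return None
    # `phantomjs --version` prints just the version number.
    output = subprocess.check_output([path, "--version"])
    return parse_version(output.decode("ascii", "replace"))


if __name__ == "__main__":
    version = phantomjs_version()
    if version is None:
        print("phantomjs not found on PATH")
    elif version < (1, 9, 8):  # assumption: 1.9.8 is the minimum, per the answer
        print("phantomjs is older than 1.9.8:", version)
    else:
        print("phantomjs looks recent enough:", version)
```

If this reports an older version, one of the symlinks is probably still pointing at a previous install.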

And then try this:

    from selenium import webdriver

    driver = webdriver.PhantomJS('phantomjs')

    # hack while the python interface lags
    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')

    driver.execute('executePhantomScript', {'script': '''
        var page = this; // won't work otherwise
        page.onResourceRequested = function(requestData, request) {
            // ...
        }
    ''', 'args': []})
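To make the `// ...` part concrete for the original question (skipping CSS and similar resources), here is a rough sketch of what the callback body could look like. It relies on `requestData.url` and `request.abort()` from the PhantomJS onResourceRequested API; the helper names `build_block_script` and `install_blocker` and the match-by-file-extension rule are my own assumptions, and I have not run this against every PhantomJS/GhostDriver combination.

```python
import json
import re


def build_block_script(extensions):
    """Build PhantomJS-side JavaScript that aborts requests by file extension.

    `extensions` is a list such as ["css", "png"]. Matching on the URL's
    extension is an illustrative rule, not something GhostDriver prescribes.
    """
    # Escape each extension so it is matched literally inside the JS regex.
    alternatives = "|".join(re.escape(ext) for ext in extensions)
    pattern = r"\.(?:" + alternatives + r")(?:\?.*)?$"
    # json.dumps produces a valid JavaScript string literal for the pattern.
    return (
        "var page = this;\n"
        "var blocked = new RegExp(" + json.dumps(pattern) + ", 'i');\n"
        "page.onResourceRequested = function(requestData, request) {\n"
        "    if (blocked.test(requestData.url)) {\n"
        "        request.abort();\n"
        "    }\n"
        "};\n"
    )


def install_blocker(driver, extensions):
    """Register the phantom/execute endpoint and send the blocking script."""
    # Same workaround as in the answer: the Python bindings lack this command.
    driver.command_executor._commands['executePhantomScript'] = (
        'POST', '/session/$sessionId/phantom/execute')
    driver.execute('executePhantomScript',
                   {'script': build_block_script(extensions), 'args': []})
```

Usage would be something like `install_blocker(driver, ["css", "png", "jpg"])` right after creating the driver and before calling `driver.get(...)`.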


The proposed solutions didn't work for me, but this one does (it uses driver.execute_script):

    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')
    driver.execute_script('''
        this.onResourceRequested = function(request, net) {
            console.log('REQUEST ' + request.url);
        };
    ''')