Prevent CSS/other resource download in PhantomJS/Selenium driven by Python
A bold young soul by the name of “watsonmw” recently added functionality to Ghostdriver (which PhantomJS uses to interface with Selenium) that allows access to PhantomJS API calls that require a page object, like the onResourceRequested one you cited.
For a solution at all costs, consider building from source (which developers note “takes roughly 30 minutes ... with 4 parallel compile jobs on a modern machine”) and integrating his patch, linked above.
Then this (untested) Python code should work as a proof of concept:
    from selenium import webdriver

    driver = webdriver.PhantomJS('phantomjs')

    # hack while the python interface lags
    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')

    driver.execute('executePhantomScript', {'script': '''
        page.onResourceRequested = function(requestData, request) {
            // ...
        }
    ''', 'args': []})
Until then, you’ll just get a "Can't find variable: page" exception.
Good luck! There are a lot of great alternatives, like working in a JavaScript environment, driving Gecko, proxies, etc.
Will's answer got me on track. (Thanks Will!)
Current PhantomJS (1.9.8) includes Ghostdriver 1.1.0 which already contains watsonmw's patch.
You need to download the latest PhantomJS, then create the following symlinks (sudo may be required):
    ln -s path/to/bin/phantomjs /usr/local/share/phantomjs
    ln -s path/to/bin/phantomjs /usr/local/bin/phantomjs
    ln -s path/to/bin/phantomjs /usr/bin/phantomjs
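Before going further, it may be worth confirming that the binary found on your PATH is really the new one. The following is only a rough sketch for Python 3: the helper names parse_version and phantomjs_is_recent are made up for illustration, and it assumes that `phantomjs --version` prints a plain dotted version such as 1.9.8.

```python
import re
import shutil
import subprocess

# Version that, per the answer above, bundles the patched Ghostdriver.
MINIMUM = (1, 9, 8)


def parse_version(text):
    """Turn a string such as '1.9.8' into a tuple of ints: (1, 9, 8)."""
    match = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version string: %r" % text)
    return tuple(int(part) for part in match.groups())


def phantomjs_is_recent(binary="phantomjs", minimum=MINIMUM):
    """Return True if `binary` is on PATH and reports at least `minimum`."""
    path = shutil.which(binary)
    if path is None:
        # Nothing by that name on PATH, so the symlinks did not take effect.
        return False
    output = subprocess.check_output([path, "--version"]).decode()
    return parse_version(output) >= minimum
```

Calling phantomjs_is_recent() with no arguments checks whatever `phantomjs` resolves to; if it returns False, an older copy may be shadowing the symlinks.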
And then try this:
    from selenium import webdriver

    driver = webdriver.PhantomJS('phantomjs')

    # hack while the python interface lags
    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')

    driver.execute('executePhantomScript', {'script': '''
        var page = this; // won't work otherwise
        page.onResourceRequested = function(requestData, request) {
            // ...
        }
    ''', 'args': []})
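To actually stop CSS (or anything else) from downloading, the `// ...` handler body has to abort the matching requests. Here is an untested-against-a-real-browser sketch of one way to do that: the helper names build_block_script and block_resources are my own, the patterns are just examples, and it relies on the second handler argument exposing abort() as described in the PhantomJS onResourceRequested documentation.

```python
import json

# Endpoint copied from the snippet above; it is Ghostdriver-specific and
# not part of the standard WebDriver protocol.
PHANTOM_EXECUTE = ('POST', '/session/$sessionId/phantom/execute')


def build_block_script(patterns):
    """Return PhantomJS-side JavaScript that aborts requests matching `patterns`.

    `patterns` is a list of JavaScript regular-expression sources (strings).
    """
    return '''
        var page = this;  // the page object is bound to `this`, as noted above
        var blocked = %s.map(function (source) {
            return new RegExp(source, 'i');
        });
        page.onResourceRequested = function (requestData, networkRequest) {
            for (var i = 0; i < blocked.length; i++) {
                if (blocked[i].test(requestData.url)) {
                    networkRequest.abort();  // skip this download
                    return;
                }
            }
        };
    ''' % json.dumps(patterns)


def block_resources(driver, patterns):
    """Register the Ghostdriver command on `driver` and install the handler."""
    driver.command_executor._commands['executePhantomScript'] = PHANTOM_EXECUTE
    return driver.execute('executePhantomScript',
                          {'script': build_block_script(patterns), 'args': []})
```

With a driver created as above, something like block_resources(driver, [r'\.css(\?|$)']) before driver.get(...) should do it. The patterns go through json.dumps so that backslashes survive the trip into JavaScript intact.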
The proposed solutions didn't work for me, but this one works (it uses driver.execute_script):
    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')

    driver.execute_script('''
        this.onResourceRequested = function(request, net) {
            console.log('REQUEST ' + request.url);
        };
    ''')