Scrape ajax web page with python and/or scrapy

The 'random link' looks like it has the form:

https://petitions.whitehouse.gov/signatures/more/petitionid/pagenum/lastpetition

where petitionid is static for a single petition, pagenum increments each time, and lastpetition is returned each time from the request.
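The paging scheme described above could be sketched roughly as follows. This is only an illustration under assumptions: `build_signatures_url`, `iter_pages` and the `fetch` callable are hypothetical names, the starting page of 2 is a guess, and how the next lastpetition value is pulled out of the real response is not known here, so that step is left to `fetch`.

```python
from typing import Callable, Iterator, Tuple

BASE = "https://petitions.whitehouse.gov/signatures/more"


def build_signatures_url(petition_id: str, page: int, last: str) -> str:
    """Join the three path segments: petitionid / pagenum / lastpetition."""
    return f"{BASE}/{petition_id}/{page}/{last}"


def iter_pages(
    petition_id: str,
    first_last: str,
    fetch: Callable[[str], Tuple[str, str]],
    start_page: int = 2,   # assumption: the first AJAX request asks for page 2
    max_pages: int = 5,    # safety limit so the loop always terminates
) -> Iterator[Tuple[str, str]]:
    """Yield (url, body) pairs, feeding each response's 'last' id into the next URL.

    `fetch(url)` is a hypothetical callable that returns (body, next_last);
    it should return an empty next_last when there is nothing more to load.
    """
    last = first_last
    for page in range(start_page, start_page + max_pages):
        url = build_signatures_url(petition_id, page, last)
        body, last = fetch(url)
        yield url, body
        if not last:
            break
```

In practice `fetch` would wrap whatever HTTP client you use and parse the new lastpetition value out of the response; passing it in as a callable keeps the paging logic testable without touching the network.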

My usual approach would be to use the requests library to emulate a session for cookies and then work out what requests the browser is making.

    import requests

    s = requests.session()
    url = 'http://httpbin.org/get'
    params = {'cat': 'Persian',
              'age': 3,
              'name': 'Furball'}
    s.get(url, params=params)

I'd pay particular attention to the following link:

<a href="/petition/shut-down-tar-sands-project-utah-it-begins-and-reject-keystone-xl-pipeline/H1MQJGMW?page=2&last=50b5a1f9ee140f227a00000b" class="load-next no-follow active" rel="50ae9207eab72aed25000003">Load Next 20 Signatures</a>
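As a rough sketch, the page and last values can be read out of that link with the standard library alone. `LoadNextParser` is a hypothetical helper name, and the assumption that the first anchor with the `load-next` class is the right one comes only from the markup shown above; what the `rel` value represents is not confirmed, so it is simply kept alongside.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class LoadNextParser(HTMLParser):
    """Collect page/last/rel from the first <a class="load-next ..."> tag."""

    def __init__(self):
        super().__init__()
        self.link = None  # filled in once a matching anchor is seen

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self.link is not None:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "load-next" not in classes:
            return
        parsed = urlparse(attrs.get("href") or "")
        query = parse_qs(parsed.query)
        self.link = {
            "path": parsed.path,
            "page": int(query["page"][0]),
            "last": query["last"][0],
            "rel": attrs.get("rel"),  # meaning not confirmed; kept as-is
        }


html = (
    '<a href="/petition/shut-down-tar-sands-project-utah-it-begins-and-reject-'
    'keystone-xl-pipeline/H1MQJGMW?page=2&last=50b5a1f9ee140f227a00000b" '
    'class="load-next no-follow active" rel="50ae9207eab72aed25000003">'
    'Load Next 20 Signatures</a>'
)
parser = LoadNextParser()
parser.feed(html)
print(parser.link)
```

The extracted values can then be used to build the next request, and the same parsing repeated on each response until no `load-next` link is present.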

python ajax web-scraping scrapy

It's hard to fully emulate jQuery/JavaScript with Python. You could have a look at spidermonkey or at web-testing automation tools like Selenium, which can fully automate any browser actions. Previous question on SO: How can Python work with javascript
