
Scraping a JSON response with Scrapy


It's the same as using Scrapy's HtmlXPathSelector for HTML responses. The only difference is that you use the json module to parse the response:

    import json

    class MySpider(BaseSpider):
        ...

        def parse(self, response):
            # Decode the JSON body into plain Python objects
            jsonresponse = json.loads(response.text)
            item = MyItem()
            item["firstName"] = jsonresponse["firstName"]
            return item

Hope that helps.
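
To try the parsing step on its own, here is a minimal sketch that runs without Scrapy. The sample payload and the `parse_person` helper are hypothetical, not part of the answer above; in a spider, `body_text` would be `response.text`.

```python
import json

# Hypothetical payload standing in for response.text
SAMPLE_BODY = '{"firstName": "Ada", "lastName": "Lovelace"}'


def parse_person(body_text):
    """Decode a JSON document and keep only the field the item needs."""
    data = json.loads(body_text)
    return {"firstName": data["firstName"]}


item = parse_person(SAMPLE_BODY)
print(item)
```

Keeping the decoding in a small function like this makes it easy to check against a saved response body before wiring it into `parse`.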


You don't need the json module to parse the response object. Since Scrapy 2.2, text responses have a json() method:

    class MySpider(BaseSpider):
        ...

        def parse(self, response):
            jsonresponse = response.json()
            item = MyItem()
            item["firstName"] = jsonresponse.get("firstName", "")
            return item
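
Note that this version reads the field with `.get("firstName", "")` rather than `["firstName"]`. A small sketch of the difference, using a hypothetical payload that lacks the key:

```python
import json

# Hypothetical payload with no "firstName" key
data = json.loads('{"lastName": "Lovelace"}')

# .get() falls back to the supplied default when the key is missing
first_name_default = data.get("firstName", "")

# Plain indexing raises KeyError for the same missing key
try:
    data["firstName"]
    raised = False
except KeyError:
    raised = True

print(repr(first_name_default), raised)
```

Which one fits better depends on whether a missing field should be tolerated or should surface as an error in the spider.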


One possible reason the JSON is not loading is that its strings are wrapped in single quotes instead of double quotes, which the JSON format does not allow. Try this:

    # response.text replaces the deprecated body_as_unicode()
    json.loads(response.text.replace("'", '"'))
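
A rough sketch of why the replacement works, and where it stops working. The sample bodies are hypothetical. The `ast.literal_eval` fallback is an alternative that is not from the answer above, and it assumes the body is really a Python-style literal (it will not accept JSON's `true`, `false`, or `null`).

```python
import ast
import json

# Hypothetical single-quoted body, which strict JSON parsing rejects
raw = "{'firstName': 'Ada'}"

try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

# The quote replacement from the answer makes it valid JSON
fixed = json.loads(raw.replace("'", '"'))

# Caveat: an apostrophe inside a value is replaced too, which breaks the string
tricky = "{'firstName': \"D'Arcy\"}"
try:
    json.loads(tricky.replace("'", '"'))
    replace_ok = True
except json.JSONDecodeError:
    replace_ok = False

# Alternative under the stated assumption: parse it as a Python literal
literal = ast.literal_eval(tricky)

print(strict_ok, fixed, replace_ok, literal)
```

If the server can be made to return valid JSON in the first place, that is more robust than either workaround.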