Scraping a JSON response with Scrapy
It's the same as using Scrapy's HtmlXPathSelector
for HTML responses. The only difference is that you should use the json
module to parse the response:
    import json

    class MySpider(BaseSpider):
        ...

        def parse(self, response):
            # Decode the JSON body into a Python dict
            jsonresponse = json.loads(response.text)
            item = MyItem()
            item["firstName"] = jsonresponse["firstName"]
            return item
Hope that helps.
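As a rough sketch of the parsing step on its own, outside any spider: the helper below takes the response text as a plain string and pulls out the "firstName" field. The function name `extract_first_name` and the sample payload are made up for illustration; only the `json` module from the standard library is used, so this can be tried without Scrapy installed.

```python
import json


def extract_first_name(body_text):
    """Parse a JSON document and return its "firstName" field, or None."""
    try:
        data = json.loads(body_text)
    except json.JSONDecodeError:
        # The body was not valid JSON (for example, an HTML error page)
        return None
    if not isinstance(data, dict):
        # Valid JSON, but not an object (for example, a top-level list)
        return None
    return data.get("firstName")


# Hypothetical payload standing in for response.text
sample = '{"firstName": "Ada", "lastName": "Lovelace"}'
print(extract_first_name(sample))
```

Inside a spider's parse() method you would pass response.text to a helper like this instead of the sample string.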
You don't need the json module to parse the response object.
Scrapy's text responses have a json() method (Scrapy 2.2 and later):
    class MySpider(BaseSpider):
        ...

        def parse(self, response):
            jsonresponse = response.json()
            item = MyItem()
            item["firstName"] = jsonresponse.get("firstName", "")
            return item
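Besides using response.json(), this version also reads the field with .get("firstName", "") instead of indexing with ["firstName"]. The difference only shows up when the key is missing. A small sketch with a plain dict, assuming a made-up payload that has no "firstName" key (the two function names are illustrative, not part of Scrapy):

```python
import json

# Hypothetical payload with no "firstName" key
record = json.loads('{"lastName": "Lovelace"}')


def first_name_strict(data):
    # Indexing raises KeyError when the key is absent
    return data["firstName"]


def first_name_lenient(data):
    # .get() falls back to the supplied default instead of raising
    return data.get("firstName", "")


try:
    first_name_strict(record)
except KeyError as exc:
    print("strict lookup failed, missing key:", exc)

print(repr(first_name_lenient(record)))
```

Which one fits depends on whether a missing field should stop the callback or produce an item with an empty value.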
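One possible reason the JSON is not loading is that its strings are wrapped in single quotes rather than the double quotes JSON requires. Try this (on newer Scrapy versions, response.text replaces the deprecated body_as_unicode()):

    json.loads(response.body_as_unicode().replace("'", '"'))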
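One caveat with the blanket replace is that it also rewrites apostrophes inside values (a name like O'Brien ends up with a stray double quote), which makes the text invalid again. A different technique, shown here only as a sketch, is to try json.loads first and fall back to ast.literal_eval, which understands Python-style single-quoted literals. The helper name `loads_lenient` is hypothetical, and the fallback will not accept JSON's true, false, or null, so it only suits payloads that look like a printed Python dict.

```python
import ast
import json


def loads_lenient(text):
    """Parse strict JSON, falling back to a Python-literal parse."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Safe evaluation of literals only (dicts, lists, strings, numbers);
        # raises ValueError or SyntaxError if the text is not a literal.
        return ast.literal_eval(text)


# Hypothetical single-quoted payload, including an apostrophe in a value
payload = "{'firstName': 'Ada', 'lastName': \"O'Brien\"}"
data = loads_lenient(payload)
print(data["firstName"], data["lastName"])
```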