How to collect data from multiple pages into single data structure with scrapy


Here is one way to handle this: pass the partially filled item from one callback to the next, and yield (or return) it only once, when the item has all of its attributes.

# Inside the spider's first callback: request the first page.
yield Request(page1,
              callback=self.page1_data)

def page1_data(self, response):
    hxs = HtmlXPathSelector(response)
    i = TestItem()
    i['name'] = 'name'
    i['age'] = 'age'
    url_profile_page = 'url to the profile page'
    # Do not yield the item yet; attach it to the next request via meta.
    yield Request(url_profile_page,
                  meta={'item': i},
                  callback=self.profile_page)

def profile_page(self, response):
    hxs = HtmlXPathSelector(response)
    # Retrieve the partially filled item passed from page1_data.
    old_item = response.request.meta['item']
    # parse other fields
    # assign them to old_item
    yield old_item
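To see why the item comes out only once and fully populated, the callback chain can be modelled in plain Python without running a crawl. This is only a sketch: `FakeRequest`, `FakeResponse`, `PAGES`, `run`, the example URLs and the `email` field are all hypothetical stand-ins invented for illustration, not Scrapy APIs or data from the question. Only the shape of the two callbacks mirrors the answer above.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakeRequest:
    """Hypothetical stand-in for a Scrapy request: URL, callback, meta dict."""
    url: str
    callback: Callable
    meta: dict = field(default_factory=dict)


@dataclass
class FakeResponse:
    """Hypothetical stand-in for a response; keeps the request that produced it."""
    url: str
    body: str
    request: FakeRequest


# Made-up "site": URL -> page content, purely for illustration.
PAGES = {
    "http://example.com/list": "Alice,30",
    "http://example.com/profile": "alice@example.com",
}


def page1_data(response):
    name, age = response.body.split(",")
    item = {"name": name, "age": age}  # partially filled item
    # Do not yield the item yet; hand it to the next callback through meta.
    yield FakeRequest("http://example.com/profile",
                      callback=profile_page,
                      meta={"item": item})


def profile_page(response):
    item = response.request.meta["item"]  # same object created in page1_data
    item["email"] = response.body
    yield item  # all attributes are set: emit the item exactly once


def run(start):
    """Tiny scheduler: follow requests, collect everything that is not a request."""
    queue, items = [start], []
    while queue:
        req = queue.pop(0)
        resp = FakeResponse(req.url, PAGES[req.url], req)
        for result in req.callback(resp):
            if isinstance(result, FakeRequest):
                queue.append(result)
            else:
                items.append(result)
    return items


items = run(FakeRequest("http://example.com/list", callback=page1_data))
print(items)
# [{'name': 'Alice', 'age': '30', 'email': 'alice@example.com'}]
```

The first callback never yields the item itself, only a request that carries it, so the output contains a single record that merges the data from both pages.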