Unable to make my script process locally created server response in the right way
I ran both scripts, and they run as intended. My findings:
downloader/exception_type_count/twisted.internet.error.ConnectionRefusedError
There is no way to get past this error without the server's permission, in this case eBay's. Logs from Scrapy:
2019-05-25 07:28:41 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/exception_count': 72,
 'downloader/exception_type_count/twisted.internet.error.ConnectionRefusedError': 64,
 'downloader/exception_type_count/twisted.web._newclient.ResponseNeverReceived': 8,
 'downloader/request_bytes': 55523,
 'downloader/request_count': 81,
 'downloader/request_method_count/GET': 81,
 'downloader/response_bytes': 2448476,
 'downloader/response_count': 9,
 'downloader/response_status_count/200': 9,
 'finish_reason': 'shutdown',
 'finish_time': datetime.datetime(2019, 5, 25, 1, 58, 41, 234183),
 'item_scraped_count': 8,
 'log_count/DEBUG': 90,
 'log_count/INFO': 9,
 'request_depth_max': 1,
 'response_received_count': 9,
 'retry/count': 72,
 'retry/reason_count/twisted.internet.error.ConnectionRefusedError': 64,
 'retry/reason_count/twisted.web._newclient.ResponseNeverReceived': 8,
 'scheduler/dequeued': 81,
 'scheduler/dequeued/memory': 81,
 'scheduler/enqueued': 131,
 'scheduler/enqueued/memory': 131,
 'start_time': datetime.datetime(2019, 5, 25, 1, 56, 57, 751009)}
2019-05-25 07:28:41 [scrapy.core.engine] INFO: Spider closed (shutdown)
You can see that only 8 items were scraped. These are just the logos and other unrestricted things.
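To put the numbers from the stats dump in perspective, here is a minimal sketch that computes how many requests failed. The values are copied from the log above; the helper name `failure_ratio` is just an illustration, not part of Scrapy.

```python
# Values copied from the Scrapy stats dump above.
stats = {
    "downloader/exception_count": 72,
    "downloader/request_count": 81,
    "downloader/response_count": 9,
    "item_scraped_count": 8,
}


def failure_ratio(stats):
    """Return the fraction of requests that ended in a downloader exception."""
    requests = stats.get("downloader/request_count", 0)
    failures = stats.get("downloader/exception_count", 0)
    return failures / requests if requests else 0.0


ratio = failure_ratio(stats)
print(f"{ratio:.0%} of requests failed")  # 72 of 81 requests
```

So 72 of the 81 requests raised an exception, which leaves exactly the 9 responses reported in `downloader/response_count`.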
Server Log
:s://.ebaystatic.com http://.ebay.com https://*.ebay.com". Either the 'unsafe-inline' keyword, a hash ('sha256-40GZDfucnPVwbvI/Q1ivGUuJtX8krq8jy3tWNrA/n58='), or a nonce ('nonce-...') is required to enable inline execution.", source: https://vi.vipr.ebaydesc.com/ws/eBayISAPI.dll?ViewItemDescV4&item=323815597324&t=0&tid=10&category=169291&seller=wardrobe-ltd&excSoj=1&excTrk=1&lsite=0&ittenable=false&domain=ebay.com&descgauge=1&cspheader=1&oneClk=1&secureDesc=1 (1)
eBay does not allow you to scrape its site.
So how can you complete your task?
Every time before scraping, check /robots.txt for the site in question. For eBay it is http://www.ebay.com/robots.txt, and you can see that almost everything is disallowed:

User-agent: *
Disallow: /*rt=nc
Disallow: /b/LH_
Disallow: /brw/
Disallow: /clp/
Disallow: /clt/store/
Disallow: /csc/
Disallow: /ctg/
Disallow: /ctm/
Disallow: /dsc/
Disallow: /edc/
Disallow: /feed/
Disallow: /gsr/
Disallow: /gwc/
Disallow: /hcp/
Disallow: /itc/
Disallow: /lit/
Disallow: /lst/ng/
Disallow: /lvx/
Disallow: /mbf/
Disallow: /mla/
Disallow: /mlt/
Disallow: /myb/
Disallow: /mys/
Disallow: /prp/
Disallow: /rcm/
Disallow: /sch/%7C
Disallow: /sch/LH_
Disallow: /sch/aop/
Disallow: /sch/ctg/
Disallow: /sl/node
Disallow: /sme/
Disallow: /soc/
Disallow: /talk/
Disallow: /tickets/
Disallow: /today/
Disallow: /trylater/
Disallow: /urw/write-review/
Disallow: /vsp/
Disallow: /ws/
Disallow: /sch/modules=SEARCH_REFINEMENTS_MODEL_V2
Disallow: /b/modules=SEARCH_REFINEMENTS_MODEL_V2
Disallow: /itm/_nkw
Disallow: /itm/?fits
Disallow: /itm/&fits
Disallow: /cta/
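If you want to do the robots.txt check in code rather than by eye, a minimal sketch with Python's standard `urllib.robotparser` could look like this. It parses a short excerpt of the rules quoted above from a string, so it runs without network access. The helper name `is_allowed` and the `/help/home` URL are assumptions made up for the example.

```python
from urllib.robotparser import RobotFileParser

# A short excerpt of the rules quoted above, not the full file.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /sch/LH_
Disallow: /ws/
Disallow: /cta/
"""


def is_allowed(url, robots_text=ROBOTS_EXCERPT, user_agent="*"):
    """Return True if robots_text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


# A path under /ws/ is covered by a Disallow rule in the excerpt.
print(is_allowed("https://www.ebay.com/ws/eBayISAPI.dll"))  # False
# Hypothetical path that no rule in the excerpt matches.
print(is_allowed("https://www.ebay.com/help/home"))  # True
```

Against the live site you would call `set_url()` and `read()` on the parser instead of passing a string. As far as I know, the standard-library parser only does plain prefix matching, so wildcard rules such as `/*rt=nc` are not interpreted as patterns. In a Scrapy project, setting `ROBOTSTXT_OBEY = True` in settings.py makes the crawler skip disallowed URLs for you.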
Therefore, go to https://developer.ebay.com/api-docs/developer/static/developer-landing.html and check their docs. The site has simpler example code for getting the items you need without scraping.