Python requests with multithreading


Install the grequests module, which runs requests on top of gevent (requests itself is not designed for async):

pip install grequests

Then change the code to something like this:

import grequests


class Test:
    def __init__(self):
        self.urls = [
            'http://www.example.com',
            'http://www.google.com',
            'http://www.yahoo.com',
            'http://www.stackoverflow.com/',
            'http://www.reddit.com/',
        ]

    def exception(self, request, exception):
        # Called for every request that raises instead of returning a response
        print("Problem: {}: {}".format(request.url, exception))

    def fetch_all(self):
        # Named fetch_all because 'async' is a reserved word since Python 3.7
        results = grequests.map(
            (grequests.get(u) for u in self.urls),
            exception_handler=self.exception,
            size=5,  # at most 5 requests in flight at once
        )
        print(results)


test = Test()
test.fetch_all()
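As far as I know, grequests.map returns one entry per request, in order, with None in place of any request that failed when the handler returns nothing. Here is a rough sketch of separating the two cases. split_results is a hypothetical helper name, and the stub objects only stand in for real responses so the snippet runs without network access:

```python
from types import SimpleNamespace


def split_results(urls, results):
    """Pair each URL with its result and separate successes from failures."""
    ok, failed = [], []
    for url, response in zip(urls, results):
        if response is None:
            # No response object: the request raised and was reported by the handler
            failed.append(url)
        else:
            ok.append((url, response.status_code))
    return ok, failed


# Stand-ins for what grequests.map might return (assumed shape, not real responses)
urls = ['http://www.example.com', 'http://www.google.com']
results = [SimpleNamespace(status_code=200), None]

ok, failed = split_results(urls, results)
print(ok)
print(failed)
```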

The requests documentation itself points to grequests for this kind of work:

Blocking Or Non-Blocking?

With the default Transport Adapter in place, Requests does not provide any kind of non-blocking IO. The Response.content property will block until the entire response has been downloaded. If you require more granularity, the streaming features of the library (see Streaming Requests) allow you to retrieve smaller quantities of the response at a time. However, these calls will still block.

If you are concerned about the use of blocking IO, there are lots of projects out there that combine Requests with one of Python's asynchronicity frameworks. Two excellent examples are grequests and requests-futures.
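If you would rather stay with plain threads, the standard library's concurrent.futures can do the fan-out as well. This is a minimal sketch under my own assumptions, not something taken from the requests docs. fetch_all and fake_fetch are hypothetical names, and fake_fetch stands in for a real call such as requests.get so the snippet runs offline:

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch, urls, max_workers=5):
    """Call fetch(url) for every URL in a thread pool; results keep input order."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch, url) for url in urls]
        for url, future in zip(urls, futures):
            try:
                results.append((url, future.result()))
            except Exception as exc:
                # Same idea as the exception handler above: report, then record None
                print("Problem: {}: {}".format(url, exc))
                results.append((url, None))
    return results


def fake_fetch(url):
    # Hypothetical stand-in for something like requests.get(url).status_code
    if 'bad' in url:
        raise ValueError('simulated failure')
    return 200


demo = fetch_all(fake_fetch, ['http://www.example.com', 'http://bad.invalid'])
print(demo)
```

With real traffic you would pass your own function in place of fake_fetch, for example one that calls requests.get with a timeout.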

Using this method gives me a noticeable performance increase with 10 URLs: 0.877s vs 3.852s with your original method.
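If you want to run a comparison like this yourself, a small timing harness is enough. The sketch below uses simulated latency rather than real requests, so its numbers will not match the ones above. timed, slow_fetch, run_sequential and run_threaded are all hypothetical names:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(func, *args):
    """Return (result, elapsed_seconds) for a single call to func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def slow_fetch(url):
    # Hypothetical stand-in for a blocking HTTP call (0.1 s of simulated latency)
    time.sleep(0.1)
    return url


def run_sequential(urls):
    # One call after another, so the latencies add up
    return [slow_fetch(u) for u in urls]


def run_threaded(urls):
    # One worker per URL, so the waits overlap
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        return list(pool.map(slow_fetch, urls))


urls = ['http://www.example.com/{}'.format(i) for i in range(5)]
seq_result, seq_time = timed(run_sequential, urls)
thr_result, thr_time = timed(run_threaded, urls)
print("sequential: {:.3f}s, threaded: {:.3f}s".format(seq_time, thr_time))
```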