
Streaming with gunicorn


I am answering my own question after doing some more research.

gunicorn server:app -k gevent

The -k gevent option selects asynchronous workers, which have the benefit of using Connection: keep-alive when serving requests. This allows a request to be served indefinitely.
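For reference, here is a minimal sketch of a streaming endpoint that the command above could serve. It is written as a plain WSGI callable in Python 3 rather than a Flask view, only to keep it dependency-free; the file name server.py, the helper number_stream, and its delay and limit parameters are assumptions made for illustration and are not taken from the original question.

```python
# server.py: hypothetical module name, chosen to match `server:app` above.
import itertools
import time


def number_stream(delay=0.1, limit=None):
    """Yield b"0 ", b"1 ", ... as bytes, pausing `delay` seconds after each chunk.

    `limit` is an illustrative knob: None means stream forever.
    """
    counter = itertools.count() if limit is None else range(limit)
    for i in counter:
        yield ("%d " % i).encode("utf-8")
        time.sleep(delay)


def app(environ, start_response):
    """Plain WSGI callable that returns a generator as the response body."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    # Returning a generator lets the server send chunks as they are produced.
    return number_stream()
```

In a Flask app the same generator could be passed to flask.Response, which accepts an iterable as the response body.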


Consider using the built-in BaseHTTPServer module (Python 2; it is http.server in Python 3) instead of gunicorn. The following example launches 100 handler threads on the same port, each running its own BaseHTTPServer.HTTPServer on one shared listening socket. It streams fine, supports multiple connections on one port, and generally runs about 2x faster than gunicorn. You can also wrap the socket in SSL if you need that.

# Python 2 module name; in Python 3 it is http.server.
import time, threading, socket, BaseHTTPServer

class Handler(BaseHTTPServer.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != '/':
            self.send_error(404, "Object not found")
            return
        self.send_response(200)
        self.send_header('Content-type', 'text/html; charset=utf-8')
        self.end_headers()
        # Serve up an infinite stream.
        i = 0
        while True:
            self.wfile.write("%i " % i)
            time.sleep(0.1)
            i += 1

# Create ONE socket.
addr = ('', 8000)
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(addr)
sock.listen(5)

# Launch 100 listener threads.
class Thread(threading.Thread):
    def __init__(self, i):
        threading.Thread.__init__(self)
        self.i = i
        self.daemon = True
        self.start()

    def run(self):
        # Third argument is bind_and_activate=False: do not bind or listen here.
        httpd = BaseHTTPServer.HTTPServer(addr, Handler, False)
        # Prevent the HTTP server from re-binding every handler.
        # https://stackoverflow.com/questions/46210672/
        httpd.socket = sock
        # No-ops assigned on the server instance, so they take no arguments.
        httpd.server_bind = httpd.server_close = lambda: None
        httpd.serve_forever()

for i in range(100):
    Thread(i)
time.sleep(9e9)

If you insist on using gunicorn anyway, remember to put it (and all its related packages: wsgi, gevent, flask) in a virtualenv to avoid conflicts with other software.


Gunicorn worker processes send "messages" to the master process to let it know they are still alive (see https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/workertmp.py#L40). However, this is not done while a response is being served (for example, see https://github.com/benoitc/gunicorn/blob/master/gunicorn/workers/sync.py#L160), so if serving the response takes longer than the timeout, the master process kills the worker.
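The effect can be illustrated with a toy model in Python 3. This is only a sketch of the idea and not gunicorn's actual code: FakeClock, ToyWorker, serve_slow_response and master_would_kill are hypothetical names, and a manually advanced clock stands in for real time so the example runs instantly.

```python
class FakeClock:
    """Manually advanced clock so the example is instant and deterministic."""

    def __init__(self):
        self.now = 0.0

    def advance(self, seconds):
        self.now += seconds


class ToyWorker:
    """Hypothetical stand-in for a worker process; not gunicorn's real class."""

    def __init__(self, clock):
        self.clock = clock
        self.last_heartbeat = clock.now

    def notify(self):
        # Record "I am alive" for the master to see.
        self.last_heartbeat = self.clock.now

    def serve_slow_response(self, chunks, seconds_per_chunk):
        # Assumption mirroring the text above: no notify() while the body is sent.
        for _ in range(chunks):
            self.clock.advance(seconds_per_chunk)


def master_would_kill(worker, timeout):
    """True if the worker has been silent for longer than `timeout` seconds."""
    return worker.clock.now - worker.last_heartbeat > timeout


clock = FakeClock()
worker = ToyWorker(clock)
worker.notify()

# 20 simulated seconds of streaming: still within a 30 second timeout.
worker.serve_slow_response(chunks=10, seconds_per_chunk=2)
alive_after_20s = not master_would_kill(worker, timeout=30)

# 20 more seconds without a heartbeat: the timeout is now exceeded.
worker.serve_slow_response(chunks=10, seconds_per_chunk=2)
killed_after_40s = master_would_kill(worker, timeout=30)
```

In this model a long stream survives only if the timeout exceeds the total serving time or the worker keeps notifying while it serves.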