
Architecture Flask vs FastAPI


This seemed interesting, so I ran a few tests with ApacheBench:

Flask

from flask import Flask
from flask_restful import Resource, Api

app = Flask(__name__)
api = Api(app)

class Root(Resource):
    def get(self):
        return {"message": "hello"}

api.add_resource(Root, "/")
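
How this app was served in the first test isn't stated; presumably the Flask development server, along these lines (the port is an assumption, chosen to match the other tests):

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=8000)  # Flask's built-in development server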

FastAPI

from fastapi import FastAPI

app = FastAPI(debug=False)

@app.get("/")
async def root():
    return {"message": "hello"}

I ran 2 tests for FastAPI, and there was a huge difference between them (the first runs 4 worker processes, while the second runs a single process with auto-reload enabled):

  1. gunicorn -w 4 -k uvicorn.workers.UvicornWorker fast_api:app
  2. uvicorn fast_api:app --reload

So here are the benchmark results for 5000 requests with a concurrency of 500:
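
For reference, that corresponds to an ApacheBench invocation along these lines, assuming the apps are served on 127.0.0.1:8000:

ab -n 5000 -c 500 http://127.0.0.1:8000/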

FastAPI with Uvicorn Workers

Concurrency Level:      500
Time taken for tests:   0.577 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      720000 bytes
HTML transferred:       95000 bytes
Requests per second:    8665.48 [#/sec] (mean)
Time per request:       57.700 [ms] (mean)
Time per request:       0.115 [ms] (mean, across all concurrent requests)
Transfer rate:          1218.58 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    6   4.5      6      30
Processing:     6   49  21.7     45     126
Waiting:        1   42  19.0     39     124
Total:         12   56  21.8     53     127

Percentage of the requests served within a certain time (ms)
  50%     53
  66%     64
  75%     69
  80%     73
  90%     81
  95%     98
  98%    112
  99%    116
 100%    127 (longest request)

FastAPI - Pure Uvicorn

Concurrency Level:      500
Time taken for tests:   1.562 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      720000 bytes
HTML transferred:       95000 bytes
Requests per second:    3200.62 [#/sec] (mean)
Time per request:       156.220 [ms] (mean)
Time per request:       0.312 [ms] (mean, across all concurrent requests)
Transfer rate:          450.09 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    8   4.8      7      24
Processing:    26  144  13.1    143     195
Waiting:        2  132  13.1    130     181
Total:         26  152  12.6    150     203

Percentage of the requests served within a certain time (ms)
  50%    150
  66%    155
  75%    158
  80%    160
  90%    166
  95%    171
  98%    195
  99%    199
 100%    203 (longest request)

For Flask:

Concurrency Level:      500
Time taken for tests:   27.827 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      830000 bytes
HTML transferred:       105000 bytes
Requests per second:    179.68 [#/sec] (mean)
Time per request:       2782.653 [ms] (mean)
Time per request:       5.565 [ms] (mean, across all concurrent requests)
Transfer rate:          29.13 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   87 293.2      0    3047
Processing:    14 1140 4131.5    136   26794
Waiting:        1 1140 4131.5    135   26794
Total:         14 1227 4359.9    136   27819

Percentage of the requests served within a certain time (ms)
  50%    136
  66%    148
  75%    179
  80%    198
  90%    295
  95%   7839
  98%  14518
  99%  27765
 100%  27819 (longest request)

Total results

Flask: Time taken for tests: 27.827 seconds

FastAPI - Pure Uvicorn: Time taken for tests: 1.562 seconds

FastAPI - Uvicorn Workers: Time taken for tests: 0.577 seconds


With Uvicorn workers, FastAPI is nearly 48x faster than Flask, which is understandable given ASGI vs WSGI. So I ran the same tests with a concurrency of 1:

FastAPI - Uvicorn Workers: Time taken for tests: 1.615 seconds

FastAPI - Pure Uvicorn: Time taken for tests: 2.681 seconds

Flask: Time taken for tests: 5.541 seconds

I ran more tests to try Flask with a production server.

5000 Requests, 1000 Concurrency

Flask with Waitress
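
The post doesn't show how Waitress was launched; here is a minimal sketch, assuming the Flask app above lives in a hypothetical module called flask_api:

from waitress import serve

from flask_api import app  # hypothetical module name for the Flask app shown earlier

serve(app, host="127.0.0.1", port=8000)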

Server Software:        waitress
Server Hostname:        127.0.0.1
Server Port:            8000

Document Path:          /
Document Length:        21 bytes

Concurrency Level:      1000
Time taken for tests:   3.403 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      830000 bytes
HTML transferred:       105000 bytes
Requests per second:    1469.47 [#/sec] (mean)
Time per request:       680.516 [ms] (mean)
Time per request:       0.681 [ms] (mean, across all concurrent requests)
Transfer rate:          238.22 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    4   8.6      0      30
Processing:    31  607 156.3    659     754
Waiting:        1  607 156.3    658     753
Total:         31  611 148.4    660     754

Percentage of the requests served within a certain time (ms)
  50%    660
  66%    678
  75%    685
  80%    691
  90%    702
  95%    728
  98%    743
  99%    750
 100%    754 (longest request)

Gunicorn with Uvicorn Workers

Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8000

Document Path:          /
Document Length:        19 bytes

Concurrency Level:      1000
Time taken for tests:   0.634 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      720000 bytes
HTML transferred:       95000 bytes
Requests per second:    7891.28 [#/sec] (mean)
Time per request:       126.722 [ms] (mean)
Time per request:       0.127 [ms] (mean, across all concurrent requests)
Transfer rate:          1109.71 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   28  13.8     30      62
Processing:    18   89  35.6     86     203
Waiting:        1   75  33.3     70     171
Total:         20  118  34.4    116     243

Percentage of the requests served within a certain time (ms)
  50%    116
  66%    126
  75%    133
  80%    137
  90%    161
  95%    189
  98%    217
  99%    230
 100%    243 (longest request)

Pure Uvicorn, but this time with 4 workers: uvicorn fastapi:app --workers 4

Server Software:        uvicorn
Server Hostname:        127.0.0.1
Server Port:            8000

Document Path:          /
Document Length:        19 bytes

Concurrency Level:      1000
Time taken for tests:   1.147 seconds
Complete requests:      5000
Failed requests:        0
Total transferred:      720000 bytes
HTML transferred:       95000 bytes
Requests per second:    4359.68 [#/sec] (mean)
Time per request:       229.375 [ms] (mean)
Time per request:       0.229 [ms] (mean, across all concurrent requests)
Transfer rate:          613.08 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0   20  16.3     17      70
Processing:    17  190  96.8    171     501
Waiting:        3  173  93.0    151     448
Total:         51  210  96.4    184     533

Percentage of the requests served within a certain time (ms)
  50%    184
  66%    209
  75%    241
  80%    260
  90%    324
  95%    476
  98%    504
  99%    514
 100%    533 (longest request)


You are using the time.sleep() function in an async endpoint. time.sleep() is blocking and should never be used in asynchronous code. What you probably want is the asyncio.sleep() function:

import asyncio

import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get('/')
async def root():
    print('Sleeping for 10')
    await asyncio.sleep(10)
    print('Awake')
    return {'message': 'hello'}

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8000)

That way, each request will take ~10 sec to complete, but you will be able to serve multiple requests concurrently.
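
You can see the effect with something like ab -n 10 -c 10 against this endpoint: all ten requests should finish after roughly 10 seconds in total, rather than 100 seconds one after another.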

In general, async frameworks offer replacements for all blocking functions inside the standard library (sleep functions, IO functions, etc.). You are meant to use those replacements when writing async code and (optionally) await them.
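
For example, here is a minimal sketch of the asyncio replacements for blocking socket IO (the host and request are only illustrative):

import asyncio

async def fetch(host: str) -> bytes:
    # asyncio's replacement for blocking socket calls:
    # open_connection returns a (StreamReader, StreamWriter) pair
    reader, writer = await asyncio.open_connection(host, 80)
    writer.write(b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
    await writer.drain()        # waits without blocking the event loop
    data = await reader.read()  # likewise yields control while waiting
    writer.close()
    await writer.wait_closed()
    return data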

Some non-blocking frameworks and libraries, such as gevent, do not offer replacements. Instead, they monkey-patch functions in the standard library to make them non-blocking. As far as I know, this is not the case for the newer async frameworks and libraries, because they are meant to let the developer use the async/await syntax.
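
To illustrate the monkey-patching approach, a minimal gevent sketch (the task function is made up):

from gevent import monkey
monkey.patch_all()  # patches time.sleep, sockets, etc. to be cooperative

import time
import gevent

def task(n):
    time.sleep(1)  # after patching, this yields to other greenlets instead of blocking
    return n

# Both greenlets sleep concurrently, so this takes about 1 second, not 2.
jobs = [gevent.spawn(task, n) for n in range(2)]
gevent.joinall(jobs)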


I think you are blocking the event loop in FastAPI, which is an asynchronous framework, whereas in Flask each request is probably run in a new thread. Move all CPU-bound tasks to separate processes, or in your FastAPI example just sleep on the event loop (do not use time.sleep here). In FastAPI, run IO-bound tasks asynchronously.
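
As a sketch of moving CPU-bound work off the event loop, assuming a made-up heavy() function and /compute endpoint:

import asyncio
from concurrent.futures import ProcessPoolExecutor

from fastapi import FastAPI

app = FastAPI()
pool = ProcessPoolExecutor()  # worker processes for CPU-bound tasks

def heavy(n: int) -> int:
    # stands in for any CPU-bound computation that would block the event loop
    return sum(i * i for i in range(n))

@app.get("/compute")
async def compute(n: int = 10_000_000):
    loop = asyncio.get_running_loop()
    # run in a separate process; the event loop stays free to serve other requests
    return {"result": await loop.run_in_executor(pool, heavy, n)}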