
Celery Worker Database Connection Pooling


I like tigeronk2's idea of one connection per worker. As he says, Celery maintains its own pool of worker processes, so there really isn't a need for a separate database connection pool. The Celery signals docs explain how to do custom initialization when a worker process is created, so I added the following code to my tasks.py and it works exactly as you would expect. I was even able to close the connections when the workers shut down:

from celery.signals import worker_process_init, worker_process_shutdown

# `db` stands in for your database driver module and DB_CONNECT_STRING
# for your connection settings; substitute your own.
db_conn = None

@worker_process_init.connect
def init_worker(**kwargs):
    global db_conn
    print('Initializing database connection for worker.')
    db_conn = db.connect(DB_CONNECT_STRING)

@worker_process_shutdown.connect
def shutdown_worker(**kwargs):
    global db_conn
    if db_conn:
        print('Closing database connection for worker.')
        db_conn.close()
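For completeness, here is how a task in the same tasks.py might use that per-process connection. This is a minimal sketch: the app name, table, and query are placeholders, and it assumes a DB-API style driver that uses %s parameter placeholders.

from celery import Celery

app = Celery('tasks')

@app.task
def fetch_user(user_id):
    # Reuses the connection opened in init_worker for this worker process.
    cur = db_conn.cursor()
    cur.execute('SELECT * FROM users WHERE id = %s', (user_id,))
    return cur.fetchone()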


Have one DB connection per worker process. Since Celery itself maintains a pool of worker processes, your DB connections will always equal the number of Celery workers. The flip side, sort of, is that it ties database connection pooling to Celery's worker-process management. But that should be fine, given that the GIL allows only one thread at a time to run in a process.
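In other words, the connection count is simply whatever concurrency you start the worker with. For example (assuming an app module named proj), this starts four worker processes and therefore opens four database connections:

celery -A proj worker --concurrency=4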


You can override the default behavior to use threaded workers instead of one worker per process in your Celery config:

CELERYD_POOL = "celery.concurrency.threads.TaskPool"

Then you can store the shared pool instance on your task instance and reference it from each threaded task invocation.
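Here is one way that might look; a sketch under assumptions, not a drop-in solution. The original answer names no driver, so this assumes psycopg2 (whose ThreadedConnectionPool is safe to share across threads) and a placeholder connection string. On newer Celery versions, the equivalent of the setting above is starting the worker with -P threads.

import threading

from celery import Celery, Task
from psycopg2.pool import ThreadedConnectionPool

app = Celery('tasks')
app.conf.CELERYD_POOL = 'celery.concurrency.threads.TaskPool'

DB_CONNECT_STRING = 'dbname=mydb user=me'  # placeholder

_pool = None
_pool_lock = threading.Lock()

def get_pool():
    # Lazily create a single pool shared by every task thread in the worker.
    # The lock matters here: unlike the prefork pool, several threads can
    # reach this initialization at the same time.
    global _pool
    with _pool_lock:
        if _pool is None:
            _pool = ThreadedConnectionPool(1, 10, DB_CONNECT_STRING)
    return _pool

class DatabaseTask(Task):
    # Celery instantiates a task once per worker, so every invocation
    # (thread) sees the same shared pool through this property.
    @property
    def pool(self):
        return get_pool()

@app.task(base=DatabaseTask, bind=True)
def run_query(self, sql):
    conn = self.pool.getconn()
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()
    finally:
        self.pool.putconn(conn)

Each threaded invocation checks a connection out of the shared pool and returns it when done, so the pool, rather than Celery's process management, bounds the number of open database connections.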