How do I properly use connection pools in redis?


Redis-py provides a connection pool from which you can retrieve connections. A connection pool creates a set of connections that you can use as needed; when you are done, the connection is returned to the pool for reuse. Creating connections on the fly without a pool (or without using the pool correctly) will leave you with far too many open connections to Redis, until you hit the connection limit.

You could choose to set up the connection pool in an init method and make the pool global (you can look at other options if you are uncomfortable with globals).

import os
import redis

redis_pool = None

def init():
    global redis_pool
    print("PID %d: initializing redis pool..." % os.getpid())
    redis_pool = redis.ConnectionPool(host='10.0.0.1', port=6379, db=0)

You can then retrieve a connection from the pool like this:

redis_conn = redis.Redis(connection_pool=redis_pool)
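
Each command on redis_conn then transparently checks a connection out of the pool and returns it once the command completes; a quick usage sketch (the key name is just an illustration):

redis_conn.set('some_key', 'some_value')
print(redis_conn.get('some_key'))  # b'some_value'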

Also, I am assuming you are using hiredis along with redis-py, as it should improve performance in certain cases. Have you checked the number of connections open to the Redis server with your existing setup? It is most likely quite high. You can use the INFO command to get that information:

redis-cli info

Check the Clients section, where the "connected_clients" field tells you how many connections are open to the Redis server at that instant.
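
You can also read the same data from Python, since redis-py exposes INFO through the client's info() method; a minimal sketch, assuming a local Redis on the default port:

import redis

r = redis.Redis(host='localhost', port=6379, db=0)
info = r.info()  # same data as `redis-cli info`
print("connected_clients:", info['connected_clients'])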


You can use a singleton (Borg pattern) based wrapper written over redis-py, which will provide a common connection pool to all your files. Whenever you use an object of this wrapper class, it will use the same connection pool.

import redis

REDIS_SERVER_CONF = {
    'servers': {
        'main_server': {
            'HOST': 'X.X.X.X',
            'PORT': 6379,
            'DATABASE': 0
        }
    }
}

class RedisWrapper(object):
    shared_state = {}

    def __init__(self):
        # Borg pattern: every instance shares the same state.
        self.__dict__ = self.shared_state

    def redis_connect(self, server_key):
        # Create the pool once and keep it in the shared state, so that
        # all instances hand out connections from the same pool.
        pool_key = 'pool_%s' % server_key
        if pool_key not in self.shared_state:
            conf = REDIS_SERVER_CONF['servers'][server_key]
            self.shared_state[pool_key] = redis.ConnectionPool(
                host=conf['HOST'], port=conf['PORT'], db=conf['DATABASE'])
        return redis.StrictRedis(connection_pool=self.shared_state[pool_key])

Usage:

r_server = RedisWrapper().redis_connect(server_key='main_server')
r_server.ping()

UPDATE

In case your files run as different processes, you will have to use a Redis proxy which pools the connections for you; instead of connecting to Redis directly, you connect to the proxy. A very stable Redis (and memcached) proxy is twemproxy, created by Twitter, whose main purpose is reducing the number of open connections.
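
From the application's side nothing changes except the address you connect to; here is a minimal sketch, assuming twemproxy is listening on 127.0.0.1:22121 (an address taken from twemproxy's example configuration, so adjust it to your setup):

import redis

# Connect to the twemproxy listener instead of Redis itself; twemproxy
# multiplexes many client connections onto a small, persistent set of
# connections to the actual Redis servers.
r = redis.StrictRedis(host='127.0.0.1', port=22121)
r.set('foo', 'bar')
print(r.get('foo'))  # b'bar'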


Here's a quote right from the Cheese Shop (PyPI) page.

Behind the scenes, redis-py uses a connection pool to manage connections to a Redis server. By default, each Redis instance you create will in turn create its own connection pool. You can override this behavior and use an existing connection pool by passing an already created connection pool instance to the connection_pool argument of the Redis class. You may choose to do this in order to implement client side sharding or have finer grain control of how connections are managed.

pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
r = redis.Redis(connection_pool=pool)

Moreover, instances are thread-safe:

Redis client instances can safely be shared between threads. Internally, connection instances are only retrieved from the connection pool during command execution, and returned to the pool directly after. Command execution never modifies state on the client instance.
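
As a quick illustration of that guarantee, here is a minimal sketch (assuming a local Redis) that shares one client across several threads; each thread only borrows a connection from the pool for the duration of each command:

import threading
import redis

r = redis.Redis(host='localhost', port=6379, db=0)  # one shared client

def worker():
    for _ in range(100):
        r.incr('shared_counter')  # a connection is checked out per command

threads = [threading.Thread(target=worker) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(r.get('shared_counter'))  # b'1000'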

You say:

So each task file has its own redis instance (which presumably is very expensive). ... For our system, we have over a dozen task files following this same pattern, and I've noticed our requests slowing down.

It's quite unlikely that several dozen connections can slow down the Redis server. And since your code already uses a connection pool behind the scenes, the problem lies somewhere other than the connections per se. Redis is an in-memory store, thus very fast in most imaginable cases. So I would rather look for the problem in the tasks themselves.

Update

From the comment by @user3813256: yes, he uses a connection pool at the task level. The normal way to utilize the built-in connection pool of the redis package is to just share the connection. In the simplest case, your settings.py may look like this:

import redis

connection = None

def connect_to_redis():
    global connection
    connection = redis.StrictRedis(host='localhost', port=6379, db=0)

Then, somewhere in the bootstrap code of your application, call connect_to_redis. After that, import the connection in your task modules.
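
A minimal sketch of such a task module (the module and key names are hypothetical); note that it imports the settings module itself rather than doing "from settings import connection", which would bind the value (None) at import time, before connect_to_redis has run:

# tasks/example_task.py
import settings

def do_work():
    # Access the attribute through the module so the task sees the
    # connection assigned later by connect_to_redis().
    settings.connection.set('some_key', 'some_value')
    return settings.connection.get('some_key')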