how to promote master, after failover on postgresql with docker

database postgresql docker docker-compose high-availability

When the Master fails the PostgreSQL cluster elects another master from the stanby nodes (based on the node weight in the cluster). So when the ex-master is finally brought back to life the cluster remains loyal to it's current master ,however the ex-master is initiated back in the cluster but this time as a standby. All of that is completely managed by the PostgreSQL and not the Pgpool.

so what you would expect is that if the new master (ex-standby) fails (or scaled to 0) then the cluster would failover to the ex-master and elect it as a leader once again and when the standby is scaled up again it would join as a standby and things are back to normal. And that is exactly what the PostgreSQL cluster would do.

But most probably the Pgpool service would fail at that moment because whenever a node fails the Pgpool sign that node status as DOWN and even if that node comes back to live it will not notify the pgpool and your traffic would not reach that node.

so if you checked the recovered node status - after its recovery - on the pgpool container using the PCP commands :

pcp_node_info -U pcp_user -h localhost -n 1 # master node idpgmaster 5432 down 1

so what you have to do is to re-attach the once fallen node back to the Pgpool manually using:

pcp_attach_node -U pcp_user -h localhost -n 1 # master node id--- executed successfully ---pcp_node_info -U pcp_user -h localhost -n 1 # master node idpgmaster 5432 up 1

At this point the pgpool recognizes the ex-master node once again and can direct traffic to it.

After that whenever you remove ( scale to 0 ) the ex-standby (now master) service the whole solution (PostgreSQL - Pgpool) would failover to the actual master and now you could bring standby up again and re-attach it to the pgpool.

P.S. The downtime is only the failover downtime of the pgpool and the pgpool service will maintain its original configuration nothing added nothing restarted (well except for the PostgresQL node that failed hopefully :D ).

database postgresql docker docker-compose high-availability

So i figure out how to sort of solve the problem,

Create the containers manually
Master docker run \ -e INITIAL_NODE_TYPE='master' \ -e NODE_ID=1 \ -e NODE_NAME='node1' \ -e CLUSTER_NODE_NETWORK_NAME='pgmaster' \ -e POSTGRES_PASSWORD='monkey_pass' \ -e POSTGRES_USER='monkey_user' \ -e POSTGRES_DB='monkey_db' \ -e CLUSTER_NODE_REGISTER_DELAY=5 \ -e REPLICATION_DAEMON_START_DELAY=120 \ -e CLUSTER_NAME='pg_cluster' \ -e REPLICATION_DB='replication_db' \ -e REPLICATION_USER='replication_user' \ -e REPLICATION_PASSWORD='replication_pass' \ -v cluster-archives:/var/cluster_archive \ -p 5432:5432 \ --net mynet \ --net-alias pgmaster \ --name pgmastertest \paunin/postgresql-cluster-pgsql
Slave docker run \ -e INITIAL_NODE_TYPE='standby' \ -e NODE_ID=2 \ -e NODE_NAME='node2' \ -e REPLICATION_PRIMARY_HOST='pgmaster' \ -e CLUSTER_NODE_NETWORK_NAME='pgslave1' \ -e REPLICATION_UPSTREAM_NODE_ID=1 \ -v cluster-archives:/var/cluster_archive \ -p 5441:5432 \ --net mynet \ --net-alias pgslave1 \ --name pgslavetest \paunin/postgresql-cluster-pgsql
Pgpool
docker run \ -e PCP_USER='pcp_user' \ -e PCP_PASSWORD='pcp_pass' \ -e PGPOOL_START_DELAY=120 \ -e REPLICATION_USER='replication_user' \ -e REPLICATION_PASSWORD='replication_pass' \ -e SEARCH_PRIMARY_NODE_TIMEOUT=5 \ -e DB_USERS='monkey_user:monkey_pass' \ -e BACKENDS='0:pgmaster:5432:1:/var/lib/postgresql/data:ALLOW_TO_FAILOVER,1:pgslave1::::' \ -p 5430:5432 \ -p 9898:9898 \ --net mynet \ --net-alias pgpool \ --name pgpooltest \paunin/postgresql-cluster-pgpool

on the line BACKENDS='0:pgmaster:5432:1:/var/lib/postgresql/data:ALLOW_TO_FAILOVER,1:pgslave1::::' \ you can add more slaves to pgppool

Stop master pgmaster, slave pgslave1 would be promoted after a few secs,
Add new slave container docker run \ -e INITIAL_NODE_TYPE='standby' \ -e NODE_ID=3 \ -e NODE_NAME='node1' \ -e REPLICATION_PRIMARY_HOST='pgslave1' \ -e CLUSTER_NODE_NETWORK_NAME='pgmaster' \ -e REPLICATION_UPSTREAM_NODE_ID=2 \ -v cluster-archives:/var/cluster_archive \ -p 5432:5432 \ --net mynet \ --net-alias pgmaster \ --name pgmastertest3 \paunin/postgresql-cluster-pgsql

on the following lines -e REPLICATION_PRIMARY_HOST='pgslave1' \ make sure you are pointing to the alias of the new master (pgslave1). -e REPLICATION_UPSTREAM_NODE_ID=2 \ make sure you are pointing to the new master node id (2). -e NODE_ID=3 \ make sure this id doesn't exists on the table repl_nodes.--net-alias pgmaster \ u can use the one from your old master, or use one that you already added on pgpool BACKENDS='0:pgmaster:5432:1:/var/lib/postgresql/data:ALLOW_TO_FAILOVER,1:pgslave1::::' \ otherwise if the new master fails repmgr wont be able to recover it.

Its a little manual, but it does what i need, and thats to add a new slave to the new master.

CodeHunter

how to promote master, after failover on postgresql with docker

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last