Understanding docker swarm in terms of high availability


Nothing more complex than that really. Like it says, Swarm (and Kubernetes, and most other tooling in this space) is declarative, which means that you tell it the state you want (e.g. 'I want 4 instances of redis') and Swarm converges the system to that state. If you have 3 nodes, it might schedule 1 redis on Node 1, 1 on Node 2, and 2 on Node 3. If Node 2 dies, the system no longer matches your declared state, so Swarm schedules a replacement redis on Node 1 or Node 3 (depending on strategy, etc.).
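
To make the "declare a state, let the system converge" idea concrete, here is a toy reconciliation step in Python. It is only a sketch, not Swarm's actual scheduler: the function name `reconcile`, the node names, and the "least-loaded node first" rule are all assumptions made for illustration.

```python
from collections import Counter


def reconcile(desired, placements, live_nodes):
    """Return a task placement list with `desired` tasks on live nodes.

    Toy model (NOT Swarm's real algorithm): `placements` holds one node
    name per running task. Tasks on dead nodes are dropped, then each
    missing task goes to the live node that currently has the fewest tasks.
    """
    if not live_nodes:
        raise ValueError("no live nodes to schedule on")

    # Keep only the tasks whose node is still alive, capped at `desired`.
    running = [node for node in placements if node in live_nodes][:desired]

    # Count tasks per live node, including nodes that have none yet.
    load = Counter({node: 0 for node in live_nodes})
    load.update(running)

    # Converge: add tasks until the observed state matches the declared one.
    while len(running) < desired:
        target = min(sorted(live_nodes), key=lambda n: load[n])
        running.append(target)
        load[target] += 1
    return running


# Hypothetical 3-node cluster, 4 redis replicas requested.
state = reconcile(4, [], ["node1", "node2", "node3"])
# node2 dies: the same call brings the cluster back to 4 replicas.
state = reconcile(4, state, ["node1", "node3"])
```

The point of the sketch is that the caller never says "start a container on node3"; it only repeats the desired count, and the loop works out what is missing.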

Now this dynamic scheduling of containers / tasks / instances brings another problem: service discovery. Swarm deals with this by maintaining an internal DNS registry and by creating a VIP (virtual IP) for each service. Instead of having to address and keep track of each redis instance, I can point to the service name (or an alias) and Swarm will automatically route traffic to where it needs to go.
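
From the client's side, discovery boils down to an ordinary DNS lookup on the service name. The sketch below assumes a service called `redis` listening on port 6379 and a client container attached to the same overlay network; the helper name `resolve_ips` is made up for this example.

```python
import socket


def resolve_ips(name, port=6379, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses that `name` resolves to.

    `resolver` defaults to the standard library lookup; it is a parameter
    only so the function can be exercised without a real Swarm network.
    """
    infos = resolver(name, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address.
    return sorted({info[4][0] for info in infos})
```

Run inside such a container, `resolve_ips("redis")` would typically return the single virtual IP of the service, while `resolve_ips("tasks.redis")` would list the individual task IPs. Either way the client code never hard-codes the address of a specific instance.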

Of course there are also other considerations:

  • Can your service support multiple backend instances? Is it stateless? How does it handle sessions, caches, etc.?
  • What does 'HA' mean for you? Multi-node? Multi-AZ? Multi-region?