We're attempting to build a multi-site cluster:
1. the web tier and application tier are active in all sites
2. only one database is active at a time, normally in the designated
We want to use 3 sites to maintain a quorum. So, if the Primary site loses
sight of both of the other sites, it will shut itself down. If the
other sites both lose sight of the Primary site, they will co-operate in
electing one of the pair as the new primary, and bring up the database
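
To make the rule above concrete, here is a minimal sketch of the decision
each site would take. It is plain Python with no ZooKeeper calls. The names
(Action, decide, pick_new_primary) and the "lowest site name wins" tie-break
are my own assumptions for illustration, not something ZooKeeper provides.

```python
from __future__ import annotations

from enum import Enum


class Action(Enum):
    STAY_AS_IS = "stay_as_is"
    SHUT_DOWN = "shut_down"
    ELECT_NEW_PRIMARY = "elect_new_primary"


def decide(me: str, primary: str, all_sites: set[str], visible: set[str]) -> Action:
    """Decide what site `me` should do, given the other sites it can still see."""
    # A site always counts itself; `visible` holds only the *other* sites.
    reachable = (visible & all_sites) | {me}
    # Strict majority: with 3 sites, at least 2 must be reachable.
    if len(reachable) <= len(all_sites) // 2:
        return Action.SHUT_DOWN
    if primary not in reachable:
        return Action.ELECT_NEW_PRIMARY
    return Action.STAY_AS_IS


def pick_new_primary(candidates: set[str]) -> str:
    """Hypothetical tie-break: the lowest site name becomes the new primary."""
    return min(candidates)
```

With three sites, an isolated primary gets SHUT_DOWN, and the two surviving
sites both get ELECT_NEW_PRIMARY and agree on the same winner because the
tie-break is deterministic.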
I am thinking that in each site, a number of sentinel processes could hold
open ephemeral znodes flagging that the site is up - with names like
"site1/sentinel-1". These sentinels could be plugged into local health
monitoring and, when the site falls into disrepair, remove their znodes. If
links between sites fail, then the ephemeral nodes would disappear too.
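
A rough sketch of what one sentinel might do. It assumes `zk` is an already
started client exposing the `create`, `exists` and `delete` calls of the
kazoo `KazooClient` (a Python ZooKeeper client). The "/sites" root and the
helper names are assumptions of mine; only the "site1/sentinel-1" naming
comes from the idea above.

```python
def sentinel_path(site, sentinel_id):
    # Assumed layout: /sites/<site>/sentinel-<n>
    return f"/sites/{site}/sentinel-{sentinel_id}"


def register_sentinel(zk, site, sentinel_id):
    """Create the ephemeral znode that flags this sentinel as up."""
    path = sentinel_path(site, sentinel_id)
    # ephemeral=True: ZooKeeper removes the znode when this session ends.
    # makepath=True: create the (persistent) parent znodes if they are missing.
    return zk.create(path, b"up", ephemeral=True, makepath=True)


def report_health(zk, site, sentinel_id, healthy):
    """Hook for local health monitoring: add or remove the znode as needed."""
    path = sentinel_path(site, sentinel_id)
    present = zk.exists(path) is not None
    if healthy and not present:
        register_sentinel(zk, site, sentinel_id)
    elif not healthy and present:
        zk.delete(path)
```

The health monitor would call report_health(zk, "site1", 1, healthy=...) on
each check. If the sentinel's session to ZooKeeper expires, the znode goes
away without any explicit delete.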
Each site would have a process that periodically checks for the presence of the
sentinel znodes of the other sites. If all disappear, then the site knows
it is in a minority partition, and shuts down services as required.
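
The periodic check could look roughly like this. Again, `zk` is assumed to
be a started kazoo-style client (`exists`, `get_children`), and the
/sites/<site>/sentinel-<n> layout and function names are hypothetical.

```python
def live_sentinels(zk, site):
    """Return the sentinel znodes currently present for `site`."""
    parent = f"/sites/{site}"
    # A missing parent znode is treated the same as "no sentinels".
    if zk.exists(parent) is None:
        return []
    return [c for c in zk.get_children(parent) if c.startswith("sentinel-")]


def visible_sites(zk, me, all_sites):
    """Other sites that still have at least one sentinel znode."""
    return {s for s in all_sites if s != me and live_sentinels(zk, s)}


def in_minority_partition(zk, me, all_sites):
    """True when every other site's sentinels have disappeared."""
    return not visible_sites(zk, me, all_sites)
```

The monitoring process would call in_minority_partition(zk, "site1",
["site1", "site2", "site3"]) on a timer and shut services down when it
returns True. This sketch does not handle the monitor itself losing its
ZooKeeper connection; the client call would raise in that case, and I
assume that should be treated like a minority partition as well.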
Is this a viable approach, or am I taking ZooKeeper out of its application
domain and just asking for trouble?