On 10/4/2012 1:56 PM, Jim Klimov wrote:
What if the backup host is down (i.e. the ex-master after the failover)?
Will your failed-over pool accept no writes until both storage machines
What if internetworking between these two heads has a glitch, and as
a result both of them become masters of their private copies (mirror
halves), and perhaps both even manage to accept writes from clients?
This is the clustering part, which involves "fencing" around the node
which is considered dead, perhaps including a hardware reset request
just to make sure it's dead, before taking over resources it used to
master (STONITH - Shoot The Other Node In The Head). In particular,
cluster setups suggest that for heartbeats, to make sure both machines
are indeed working, you use at least two separate wires (e.g. serial and LAN)
without active hardware (switches) in-between, separate from data
this all makes a lot of sense. didn't mean to imply there are no
failure modes that can take you down entirely. i was aware of the
split-brain issue. i was not sure what richard was getting at...
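
for what it's worth, the fencing rule described above can be sketched roughly
as follows. this is only an illustration under stated assumptions, not how any
particular cluster framework does it: the function name decide_takeover, the
channel names, and the fence_peer callback are all hypothetical. the idea is
that the peer counts as dead only when every independent heartbeat channel is
silent, and resources are taken over only after the fence (STONITH) request is
confirmed.

```python
from typing import Callable, Mapping


def decide_takeover(peer_alive_by_channel: Mapping[str, bool],
                    fence_peer: Callable[[], bool]) -> str:
    """Return "standby", "takeover", or "refuse" for the surviving node.

    peer_alive_by_channel maps a heartbeat channel name (hypothetical,
    e.g. "serial", "lan") to whether the peer was heard on it recently.
    fence_peer is a hypothetical callback that asks for a hardware reset
    of the peer and returns True only if the reset was confirmed.
    """
    # A single heartbeat link cannot tell a dead peer from a dead wire.
    if len(peer_alive_by_channel) < 2:
        raise ValueError("need at least two independent heartbeat channels")

    # If the peer answers on any channel it is still alive: do nothing.
    if any(peer_alive_by_channel.values()):
        return "standby"

    # All channels are silent: fence the peer first, then take over.
    if fence_peer():
        return "takeover"

    # Fencing could not be confirmed, so the peer might still be writing.
    # Refusing to take over avoids two masters (split-brain).
    return "refuse"


if __name__ == "__main__":
    print(decide_takeover({"serial": False, "lan": True}, lambda: True))
    print(decide_takeover({"serial": False, "lan": False}, lambda: True))
    print(decide_takeover({"serial": False, "lan": False}, lambda: False))
```

the "refuse" branch is the important one: when fencing cannot be confirmed,
the sketch prefers downtime over the risk of both heads accepting writes.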