Nicolas Williams wrote:
> On Thu, Aug 28, 2008 at 11:29:21AM -0500, Bob Friesenhahn wrote:
>
>> Which of these do you prefer?
>>
>> o System waits substantial time for devices to (possibly) recover in
>>   order to ensure that subsequently written data has the least
>>   chance of being lost.
>>
>> o System immediately ignores slow devices and switches to
>>   non-redundant non-fail-safe non-fault-tolerant may-lose-your-data
>>   mode. When system is under intense load, it automatically
>>   switches to the may-lose-your-data mode.
>
> Given how long a resilver might take, waiting some time for a device to
> come back makes sense. Also, if a cable was taken out, or a drive tray
> powered off, then you'll see lots of drives timing out, and then the
> better thing to do is to wait (heuristic: not enough spares to recover).
argv! I didn't even consider switches. Ethernet switches often use
spanning-tree algorithms to converge on the topology; I'm not sure what
SAN switches use. We have the following problem with highly available
clusters which use switches in the interconnect:

+ Solaris Cluster interconnect timeout defaults to 10 seconds
+ STP can take > 30 seconds to converge

So, if you use Ethernet switches in the interconnect, you need to
disable STP on the ports used for interconnects, or risk unnecessary
cluster reconfigurations. Normally this isn't a problem, as the people
who tend to build HA clusters also tend to read the docs, which point
this out. Still, a few slip through every few months. As usual, Solaris
Cluster gets blamed, though it really is a systems engineering problem.

Can we expect a similar attention to detail from ZFS implementers? I'm
afraid not :-(. I'm not confident we can be successful with sub-minute
reconfiguration, so B_FAILFAST may be the best we can do for the general
case. That isn't so bad; in fact, we use failfasts rather extensively in
Solaris Cluster, too.
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
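[For readers hitting the STP problem above: on Cisco-style switches, one
common fix is to enable PortFast on the interconnect ports, which skips
the listening/learning states that account for the ~30-second
convergence delay. The interface name below is illustrative, and the
exact commands vary by vendor and firmware; check your switch's docs.]

    interface GigabitEthernet0/1
     description cluster interconnect - skip STP convergence
     spanning-tree portfast

PortFast should only be applied to ports connected to end hosts (such as
cluster nodes), never to switch-to-switch links, where it can create
forwarding loops.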
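[The B_FAILFAST idea mentioned above can be sketched as "try with a short
deadline first, then retry patiently." This is a minimal illustrative
model, not the Solaris kernel API; the names `read_with_failfast`,
`FAST`, and `SLOW`, and the timeout values, are all hypothetical.]

```python
FAST = 1.0    # seconds: fail-fast attempt (hypothetical value)
SLOW = 60.0   # seconds: patient retry, long enough for STP to reconverge

def read_with_failfast(do_read):
    """Attempt an I/O with a short deadline; retry once with a long one."""
    try:
        return do_read(FAST)
    except TimeoutError:
        # The device may only be slow (cable pull, switch reconvergence),
        # so give it one patient retry before declaring it failed.
        return do_read(SLOW)

# Example: a "device" that only answers when given more than 5 seconds.
def slow_device(deadline):
    if deadline < 5.0:
        raise TimeoutError("fail-fast deadline expired")
    return b"data"

print(read_with_failfast(slow_device))  # b'data'
```

The point of the pattern is that the fast path lets the system route
around a dead device quickly, while the slow retry avoids declaring a
merely-slow device failed, which is exactly the trade-off in Bob's two
options quoted above.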