Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

Bob Friesenhahn Thu, 28 Aug 2008 09:34:28 -0700

On Thu, 28 Aug 2008, Ross wrote:
>
> I believe ZFS should apply the same tough standards to pool 
> availability as it does to data integrity.  A bad checksum makes ZFS 
> read the data from elsewhere, why shouldn't a timeout do the same 
> thing?


One problem is that for some devices, a five-minute timeout is OK.  For 
others, something must be wrong if the device does not respond within a 
second or two.

If the system or device is simply overwhelmed with work, then you would 
not want the system to go haywire and make the problems much worse.

Which of these do you prefer?

   o System waits substantial time for devices to (possibly) recover in
     order to ensure that subsequently written data has the least
     chance of being lost.

   o System immediately ignores slow devices and switches to
     non-redundant non-fail-safe non-fault-tolerant may-lose-your-data
     mode.  When system is under intense load, it automatically
     switches to the may-lose-your-data mode.
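
To make the trade-off concrete, here is a rough Python sketch of the two
policies.  It is not ZFS code: Device, devices_to_drop, IMPATIENT_CUTOFF_S,
and all of the timeout values are made up for illustration only.

```python
from dataclasses import dataclass

# Hypothetical short cutoff used by the "impatient" policy (seconds).
IMPATIENT_CUTOFF_S = 2.0


@dataclass
class Device:
    name: str
    timeout_s: float  # assumed per-device limit; some devices need minutes
    pending_s: float  # how long the oldest outstanding I/O has been waiting


def devices_to_drop(devices, patient):
    """Return the names of devices the chosen policy would stop waiting for."""
    if patient:
        # First option: give each device its own (possibly long) time to recover.
        return [d.name for d in devices if d.pending_s > d.timeout_s]
    # Second option: anything slower than the short cutoff is ignored.
    return [d.name for d in devices if d.pending_s > IMPATIENT_CUTOFF_S]


# A two-way mirror under heavy load: both sides are slow, neither has failed.
mirror = [
    Device("disk0", timeout_s=300.0, pending_s=3.0),
    Device("disk1", timeout_s=5.0, pending_s=3.0),
]

print(devices_to_drop(mirror, patient=True))
print(devices_to_drop(mirror, patient=False))
```

With these invented numbers the patient policy drops nothing, while the
impatient policy drops both sides of the mirror merely because the system
is busy.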

Bob
======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
