Gary Mills wrote:
> On Thu, Dec 11, 2008 at 10:41:26PM -0600, Bob Friesenhahn wrote:
>> On Thu, 11 Dec 2008, Gary Mills wrote:
>>> The split responsibility model is quite appealing.  I'd like to see
>>> ZFS address this model.  Is there not a way that ZFS could delegate
>>> responsibility for both error detection and correction to the storage
>>> device, at least one more sophisticated than a physical disk?
>>
>> Why is split responsibility appealing?  In almost any complex system,
>> whether it be government or computing, split responsibility results in
>> indecision and confusion.  Hierarchical decision making based on
>> common rules is another matter entirely.
>
> Now this becomes semantics.  There still has to be a hierarchy, but
> it's split into areas of responsibility.  In the case of ZFS over SAN
> storage, the area boundary now is the SAN cable.
I think I see where you are coming from.  Suppose we make an
operational definition that says a SAN is a transport for block-level
data.  Then...

>> Unfortunately, SAN equipment is still based on technology developed
>> in the early '80s and simply tries to behave like a more reliable
>> disk drive rather than a participating intelligent component in a
>> system which may detect, tolerate, and spontaneously correct any
>> faults.
>
> That's exactly what I'm asking.  How can ZFS and SAN equipment be
> improved so that they cooperate to make the whole system more
> reliable?  Converting the SAN storage into a JBOD is not a valid
> solution.

ZFS only knows about block devices.  It really doesn't care whether
that block device is an IDE disk, a USB disk, or something on the SAN.
If you want ZFS to be able to repair damage that it detects, then ZFS
needs to manage the data redundancy.  If you don't care that ZFS may
not be able to repair damage, then don't configure ZFS with
redundancy.  It really is that simple.

The stack looks something like:

	application
	    |   read(), write(), mmap(), etc.
	ZFS
	    |   read(), write(), ioctl(), etc.
	block device

Ideally, applications would manage their own data integrity, but
developers tend to let file systems or block-level systems manage it
instead.

>> No matter how good your SAN is, it won't spot a flaky cable or bad RAM.
>
> Of course it will.  There's an error-checking protocol that runs over
> the SAN cable.  Memory will detect errors as well.  There's error
> checking, or checking and correction, every step of the way.  Better
> integration of all of this error checking could be an improvement,
> though.

However, there are a number of failure modes which cannot be detected
by such things.  By implementing more end-to-end checking, you can see
when your SAN switch firmware stuffs nulls into your data stream, or
your disk reads the data from the wrong sector (for example).
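The argument above can be sketched in a few lines: a checksum computed
at the top of the stack detects corruption introduced by any layer
below it, but repair is only possible when that top layer also manages
a redundant copy.  This is a minimal illustration in Python, not how
ZFS is actually implemented; the in-memory "mirror" buffers here are
hypothetical stand-ins for block devices.

```python
import hashlib

# Two in-memory "block devices" standing in for a mirrored pair.
# (Illustrative only -- ZFS keeps checksums in its block pointers.)
mirror = [bytearray(64), bytearray(64)]

def write_block(data: bytes) -> str:
    """Write to both sides of the mirror; keep an end-to-end checksum."""
    for side in mirror:
        side[:len(data)] = data
    return hashlib.sha256(data).hexdigest()

def read_block(length: int, checksum: str) -> bytes:
    """Read, verify against the end-to-end checksum, self-heal if possible."""
    for side in mirror:
        data = bytes(side[:length])
        if hashlib.sha256(data).hexdigest() == checksum:
            # A good copy was found: repair every side from it.
            for other in mirror:
                other[:length] = data
            return data
    raise IOError("checksum mismatch on every copy: unrepairable")

payload = b"important application data"
cksum = write_block(payload)

# A faulty layer below "stuffs nulls into the data stream" on one side;
# that layer's own CRCs would never notice, since it wrote what it read.
mirror[0][0:4] = b"\x00\x00\x00\x00"

# The end-to-end checksum catches it; the good copy repairs the bad one.
assert read_block(len(payload), cksum) == payload
assert bytes(mirror[0][:len(payload)]) == payload
```

With a single copy, the same checksum would still detect the damage,
but `read_block` could only raise an error -- which is the point above:
detection needs end-to-end checking, repair needs redundancy managed at
the checking layer.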
No matter how much reliability is built into each step of the way, you
must trust the subsystem at each step, and anecdotally, there are many
subsystems which cannot be trusted: disks, arrays, switches, HBAs,
memory, etc.  You will find similar end-to-end design elsewhere,
particularly in the security field.
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss