...

> >> And how about FAULTS?
> >> hw/firmware/cable/controller/ram/...
> >
> > If you had read either the CERN study or what I already said about
> > it, you would have realized that it included the effects of such
> > faults.
> 
> 
> ...and ZFS is the only prophylactic available.

You don't *need* a prophylactic if you're not having sex:  the CERN study found 
*no* clear instances of faults that would occur in consumer systems and that 
could be attributed to the kinds of errors that ZFS can catch and more 
conventional file systems can't.  It found faults in the interaction of its 
add-on RAID controller (not a normal 'consumer' component) with its WD disks; 
it found single-bit errors that appeared to correlate with ECC RAM errors 
(i.e., that likely occurred in RAM rather than at any point where ZFS would be 
involved); and it found block-sized errors that appeared to correlate with 
misplaced virtual memory allocation (again, outside ZFS's sphere of influence).

> 
> 
> >
> > ...
> >
> >>>> but I had a box that was randomly corrupting blocks during
> >>>> DMA.  The errors showed up when doing a ZFS scrub and
> >>>> I caught the problem in time.
> >>>
> >>> Yup - that's exactly the kind of error that ZFS and WAFL do a
> >>> perhaps uniquely good job of catching.
> >>
> >> WAFL can't catch all: It's distantly isolated from
> >> the CPU end.
> >
> > WAFL will catch everything that ZFS catches, including the kind of
> > DMA error described above:  it contains validating information
> > outside the data blocks just as ZFS does.
> 
> Explain how it can do that, when it is isolated from the application
> by several layers including the network?

Darrell covered one aspect of this (i.e., that ZFS couldn't either if it were 
being used in a server), but there's another as well:  as long as the NFS 
messages between client RAM and server RAM are checksummed in RAM on both ends, 
then that extends the checking all the way to client RAM (the same place where 
local ZFS checks end) save for any problems occurring *in* RAM at one end or 
the other (and ZFS can't deal with in-RAM problems either:  all it can do is 
protect the data until it gets to RAM).
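
For anyone who wants the distinction spelled out, here is a rough sketch of it. 
It is not taken from ZFS, WAFL, or NFS:  Store, flip_bit, and demo are made-up 
names, a pair of dictionaries stands in for the disk, and CRC32 stands in for 
whatever checksum a real system would use.  The only point is that a checksum 
kept apart from the data catches damage that happens *after* the checksum was 
computed from the RAM buffer, and cannot catch damage that was already in that 
buffer.

```python
import zlib


class ChecksumError(Exception):
    """Raised when a block does not match its separately stored checksum."""


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit inverted (simulated corruption)."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


class Store:
    """Toy block store (hypothetical): checksums live apart from the data."""

    def __init__(self):
        self.blocks = {}      # block id -> data as it sits "on disk"
        self.checksums = {}   # block id -> checksum, kept outside the block

    def write(self, block_id, ram_buffer: bytes) -> None:
        # The checksum is computed from whatever is in RAM at this moment.
        self.checksums[block_id] = zlib.crc32(ram_buffer)
        self.blocks[block_id] = ram_buffer

    def read(self, block_id) -> bytes:
        data = self.blocks[block_id]
        if zlib.crc32(data) != self.checksums[block_id]:
            raise ChecksumError(f"block {block_id} failed verification")
        return data


def demo():
    store = Store()
    original = b"payload as the application produced it"

    # Case 1: damage after the checksum was taken (DMA, cable, controller,
    # disk).  Verification on read catches it.
    store.write(1, original)
    store.blocks[1] = flip_bit(store.blocks[1], 5)
    try:
        store.read(1)
        caught_in_path = False
    except ChecksumError:
        caught_in_path = True

    # Case 2: damage in RAM before the checksum was taken.  The checksum
    # faithfully covers the bad data, so the read verifies cleanly.
    store.write(2, flip_bit(original, 5))
    silently_wrong = store.read(2) != original

    return caught_in_path, silently_wrong


if __name__ == "__main__":
    print(demo())
```

Case 1 raises ChecksumError and case 2 hands back wrong data without complaint, 
so demo() returns (True, True).  The same reasoning applies when the buffer is 
an NFS message checksummed in client RAM and verified in server RAM:  the 
protected path gets longer, but it still starts and ends in RAM.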

- bill
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss