Richard Elling wrote:
> Adrian Saul wrote:
>> Howdy. I have several times had issues with consumer-grade PC
>> hardware and ZFS not getting along.  The problem is not the disks
>> but the fact that I don't have ECC and end-to-end checking on the
>> datapath.  What is happening is that random memory errors and bit
>> flips are written out to disk, and when they are read back ZFS
>> reports a checksum failure:
>> 
>>   pool: myth
>>  state: ONLINE
>> status: One or more devices has experienced an error resulting in
>>         data corruption.  Applications may be affected.
>> action: Restore the file in question if possible.  Otherwise restore
>>         the entire pool from backup.
>>    see: http://www.sun.com/msg/ZFS-8000-8A
>>  scrub: none requested
>> config:
>>
>>         NAME        STATE     READ WRITE CKSUM
>>         myth        ONLINE       0     0    48
>>           raidz1    ONLINE       0     0    48
>>             c7t1d0  ONLINE       0     0     0
>>             c7t3d0  ONLINE       0     0     0
>>             c6t1d0  ONLINE       0     0     0
>>             c6t2d0  ONLINE       0     0     0
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>         /myth/tv/1504_20080216203700.mpg
>>         /myth/tv/1509_20080217192700.mpg
>> 
>> Note there are no per-disk errors, just errors at the RAID and pool
>> level.  I get the same thing on a mirror pool where both sides of
>> the mirror have identical errors.  All I can assume is that the data
>> was corrupted after the checksum was calculated, and was flushed to
>> disk like that.  In the past it was a motherboard capacitor that had
>> popped - but it was enough to generate these errors under load.

I got a similar CKSUM error recently in which a block from a different
file ended up in one of my files.  So it was not a simple bit flip -
64K of the file was bad.  That said, I do not think any disk
filesystem should tolerate even bit flips.  Even in video files, I'd
want to know the data had been corrupted.

I hacked the ZFS source to temporarily ignore the error so I could see
what was wrong.  Your error(s) might be something of this kind, though
if so I do not understand how both sides of your mirror were affected
in the same way.  Do you actually know that they were, or did ZFS
simply say the file was not recoverable?  In the latter case the two
mirror copies might have had different bad bits.

For me, at least on subsequent reboots, no read or write errors were
reported either, just CKSUM errors (I seem to recall other errors -
read or write - being listed at some point, but they were cleared on
reboot, so I cannot say exactly).  And I would expect no read or write
errors if it was simply a misdirected block write.  Still, in that
case I would wonder why I didn't see *2* files with errors.  My point
is that this may not be a memory glitch; it could also be an IDE cable
problem, as mine turned out to be.  See my post here:

http://lists.freebsd.org/pipermail/freebsd-stable/2008-February/040355.html
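
If you have not already, a scrub should at least confirm whether the
bad data is really persistent on disk in every copy, rather than
something transient on the read path.  A minimal sketch, using the
pool name from your status output:

    # zpool scrub myth
    # zpool status -v myth      (watch the CKSUM counts and file list)
    # zpool clear myth          (resets the error counters once you
                                 have restored or removed the files)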

>> At any rate ZFS is doing the right thing by telling me - what I
>> don't like is that from that point on I can't convince ZFS to ignore
>> it.  The data in question is video files - a bit flip here or there
>> won't matter.  But if ZFS reads the affected block it returns an
>> I/O error, and until I restore the file I have no option but to try
>> to make the application skip over it.  If it were UFS, for example,
>> I would never have known, but ZFS makes a point of stopping anything
>> using it - understandably, but annoyingly as well.

I understand your situation, and I agree that user control might be
nice (in my case, I would not have had to tweak the ZFS code).  I do
think zpool status should still reveal the error, however, even if the
file read does not report it (when ZFS has been set to ignore the
error).  I can also imagine this could be a bit dangerous if, e.g.,
the user forgets the option is set.
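
In the meantime, a plain dd copy can usually get past the bad block so
the application never sees the EIO.  A rough sketch - the input is
just your first affected file, the output name is hypothetical, and
the 128k block size is only a guess at the dataset's recordsize:

    # dd if=/myth/tv/1504_20080216203700.mpg \
         of=/myth/tv/1504_salvaged.mpg bs=128k conv=noerror,sync

conv=noerror keeps dd going after the read error, and conv=sync pads
the failed block with nulls so the output stays the same length (you
get a glitch in the video rather than a truncated file).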

>> PS: And yes, I am now buying some ECC memory.

Good practice in general - I always use ECC.  There is nothing worse
than silent data corruption.

> I don't recall when this arrived in NV, but the failmode parameter
> for storage pools has already been implemented.  From zpool(1m)
>      failmode=wait | continue | panic
> 
>          Controls the system behavior  in  the  event  of  catas-
>          trophic  pool  failure.  This  condition  is typically a
>          result of a  loss  of  connectivity  to  the  underlying
>          storage device(s) or a failure of all devices within the
>          pool. The behavior of such an  event  is  determined  as
>          follows:
> 
>          wait        Blocks all I/O access until the device  con-
>                      nectivity  is  recovered  and the errors are
>                      cleared. This is the default behavior.
> 
>          continue    Returns EIO to any new  write  I/O  requests
>                      but  allows  reads  to  any of the remaining
>                      healthy devices.  Any  write  requests  that
>                      have  yet  to  be committed to disk would be
>                      blocked.
> 
>          panic       Prints out a message to the console and gen-
>                      erates a system crash dump.

Is "wait" the default behavior now?  When I had CKSUM errors, reading
the file would return EIO and stop reading at that point (returning only
the good data so far).  Do you mean it blocks access on the errored
file, or on the whole device?  I've noticed the former, but not the latter.

Also, I'm not sure I understand "continue".  This also seems more severe
than current behavior, in which access to any files other than the
one(s) with errors still work.
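
For reference, the property is queried and set per pool - "myth" here
is just the pool name from the original report:

    # zpool get failmode myth
    # zpool set failmode=continue myth

Note that the man page text above scopes failmode to catastrophic pool
failure (loss of the underlying devices), so it may not change the
per-file EIO behavior we are both describing.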

                                        -Joe