I suspect this is a bug in RAID-Z error reporting.  With a mirror,
each copy either checksums correctly or it doesn't, so we know
which drives gave us bad data.  With RAID-Z, we have to infer
which drives have damage.  If the number of drives returning bad
data is less than or equal to the number of parity drives, we can
both detect and correct the error.  But if, say, three drives in
a RAID-Z2 stripe return corrupt data, we have no way to know which
drives are at fault -- there's just not enough information, and I
mean that in the mathematical sense (fewer equations than unknowns).
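
To make the inference concrete, here is a minimal sketch of the idea in Python.  It is a toy model, not the actual ZFS code: it assumes a single XOR parity column (RAID-Z1-like), uses SHA-256 as a stand-in for the block checksum, and every function name in it is hypothetical.  The point is only that attributing blame means assuming each drive in turn is bad, rebuilding it from the others, and checking whether the block checksum then verifies.

```python
import hashlib
from functools import reduce


def xor_cols(cols):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, t) for t in zip(*cols))


def checksum(data_cols):
    """Stand-in block checksum over the data columns (assumption: SHA-256)."""
    return hashlib.sha256(b"".join(data_cols)).digest()


def diagnose(cols, expected):
    """cols holds the data columns followed by one XOR parity column.

    Returns (status, index of the data column found to be bad, or None).
    """
    data = cols[:-1]
    if checksum(data) == expected:
        return ("ok", None)
    # Assume each data column in turn is the bad one and rebuild it
    # from the remaining columns plus parity.
    for i in range(len(data)):
        rebuilt = xor_cols(cols[:i] + cols[i + 1:])
        trial = data[:i] + [rebuilt] + data[i + 1:]
        if checksum(trial) == expected:
            return ("repaired", i)
    # No single-column hypothesis verifies: we know the block is bad,
    # but not which columns to blame.
    return ("undiagnosable", None)


good = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_cols(good)
expected = checksum(good)

# One bad column, one parity column: the bad drive can be identified.
print(diagnose([good[0], b"XXXX", good[2], parity], expected))  # ('repaired', 1)

# Two bad columns, one parity column: detected, but not attributable.
print(diagnose([b"XXXX", b"YYYY", good[2], parity], expected))  # ('undiagnosable', None)
```

In the second case the error is visible at the block level, yet no per-drive counter can be charged with any confidence, which is the situation described above.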

That said, we should enhance 'zpool status' to indicate the number
of detected-but-undiagnosable errors on each RAID-Z vdev.

Jeff

Kevin wrote:
> We'll try running all of the diagnostic tests to rule out any other issues.
> 
> But my question is, wouldn't I need to see at least 3 checksum errors on the 
> individual devices in order for there to be a visible error in the top level 
> vdev? There doesn't appear to be enough raw checksum errors on the disks for 
> there to have been 3 errors in the same vdev block. Or am I not understanding 
> the checksum count correctly?
>  
>  
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
