On 10/26/2012 04:29 AM, Karl Wagner wrote:
>
> Does it not store a separate checksum for a parity block? If so, it
> should not even need to recalculate the parity: assuming checksums
> match for all data and parity blocks, the data is good.
>
> I could understand why it would not store a checksum for a parity
> block. It is not really necessary: Parity is only used to reconstruct
> a corrupted block, so you can reconstruct the block and verify the
> data checksum. But I can also see why they would: Simplified logic,
> faster identification of corrupt parity blocks (more useful for
> RAIDZ2 and greater), and the general principle that all blocks are
> checksummed.
>
> If this was the case, it should mean that RAIDZ scrub is faster than
> mirror scrub, which I don't think it is. So this post is probably
> redundant (pun intended)
>

Parity is very simple to calculate and doesn't use a lot of CPU - just
slightly more work than reading all the blocks: read all the blocks
on all the drives involved in a stripe, then do a simple XOR
operation across all the data.  The actual checksums cost more,
noticeably so when the pool uses sha256 rather than one of the Fletcher
checksums (ZFS does not use MD5) - much nicer when these can be
hardware accelerated.
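
To make the XOR step concrete, here is a minimal sketch of single-parity
calculation and reconstruction. It is only an illustration, not ZFS code:
the helper name xor_blocks and the tiny 4-byte "blocks" are made up for
the example, and it covers plain XOR parity only.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    # zip(*blocks) yields one tuple per byte position across all blocks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks in one stripe.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is simply the XOR of all data blocks.
parity = xor_blocks(data)

# Pretend the second block is lost: XORing the surviving data blocks
# with the parity block gives the missing block back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
```

XORing every data block together with the parity block gives all zero
bytes, which is the parity consistency check described above.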

Also, on x86, there are SSE block operations that make XORing a
whole block a lot faster by processing 16 bytes at a time instead of
one byte per loop iteration - not sure which ZFS implementations take
advantage of these, but in the end XOR is not an expensive operation.
A cryptographic hash costs far more per byte.
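
As a rough analogy for the "whole chunk at a time" idea, the sketch below
XORs two buffers once byte by byte and once as a single wide integer.
This is not SSE and not how any ZFS implementation does it; the function
names are hypothetical, and the point is only that both approaches give
the same result while the wide one does far fewer separate operations.

```python
def xor_bytewise(a, b):
    """XOR two equal-length buffers one byte per loop iteration."""
    return bytes(x ^ y for x, y in zip(a, b))

def xor_wide(a, b):
    """XOR two equal-length buffers as one big integer operation."""
    # Assumes len(a) == len(b); the byte order only has to match on the
    # way in and on the way out.
    wide = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    return wide.to_bytes(len(a), "little")

# Two hypothetical 4 KiB buffers with fixed contents.
buf_a = bytes(range(256)) * 16
buf_b = bytes(reversed(range(256))) * 16

same = xor_bytewise(buf_a, buf_b) == xor_wide(buf_a, buf_b)
```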
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss