Markus Kovero wrote:
What is an example of where a checksummed outside pool would not be able
to protect a non-checksummed inside pool? Would an intermittent
RAM/motherboard/CPU failure that only corrupted the inner pool's block
before it was passed to the outer pool (and did not corrupt the outer
pool's block) be a valid example?
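
A minimal sketch of that scenario, not ZFS code. It assumes the inner pool recorded its checksum before the fault hit, and it uses CRC32 as a stand-in for a real pool checksum. The names (checksum, flip_bit, handed_to_outer) are hypothetical, for illustration only.

```python
import zlib


def checksum(data: bytes) -> int:
    # CRC32 is only a stand-in here for a real pool checksum algorithm.
    return zlib.crc32(data)


def flip_bit(data: bytes, bit: int) -> bytes:
    # Simulate a single-bit fault in RAM or on a bus (hypothetical helper).
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


original = b"inner pool block payload"

# What the inner pool would record if it kept its own checksums.
inner_sum = checksum(original)

# The fault corrupts the block after the inner layer produced it,
# but before the outer pool has seen it.
handed_to_outer = flip_bit(original, 5)

# The outer pool checksums whatever bytes it was given.
outer_sum = checksum(handed_to_outer)

# Later the disk returns exactly what the outer pool wrote.
read_back = handed_to_outer

outer_ok = checksum(read_back) == outer_sum   # outer pool sees nothing wrong
inner_ok = checksum(read_back) == inner_sum   # inner checksum catches it

print("outer pool verifies:", outer_ok)
print("inner pool verifies:", inner_ok)
```

In this sketch the outer pool faithfully protects bytes that were already wrong when it received them, so only a checksum taken at the inner layer can notice the damage.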
If checksums are desirable in this scenario, then redundancy would also
be needed to recover from checksum failures.
That is also an excellent point: what is the point of checksumming if you cannot recover from a checksum failure?
Checksum errors can tell you there is probably a problem worthy of
attention. They can prevent you from making things worse by stopping
you in your tracks until whatever triggered them is resolved, or enough
redundancy is available to overcome the errors. This is why operating
system kernels panic/abend/BSOD when they detect that the system state
has been changed in an unknown way which could have unpredictable (and
likely bad) results on further operations.
Redundancy is useful when you can't recover the data by simply asking
for it to be re-sent or by getting it from another source.
Communications buses and protocols will use checksums to detect
corruption and resends/retries to recover from checksum failures. That
strategy doesn't work when you are talking about your end storage media.
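
The difference between the two recovery strategies can be sketched roughly as follows. This is an illustration under simple assumptions, not how any particular bus or filesystem implements it: CRC32 stands in for the real checksum, and receive_with_retry and read_from_copies are hypothetical helper names.

```python
import zlib


def checksum(data: bytes) -> int:
    # CRC32 as a stand-in for whatever checksum the link or pool uses.
    return zlib.crc32(data)


def receive_with_retry(send, expected_sum, max_tries=3):
    """Transport style: on a mismatch, ask the sender to send again."""
    for attempt in range(1, max_tries + 1):
        data = send()
        if checksum(data) == expected_sum:
            return data, attempt
    raise IOError("checksum mismatch after retries")


def read_from_copies(copies, expected_sum):
    """Storage style: rereading returns the same bad bytes, so the only
    way to recover is to find a redundant copy that still verifies."""
    for index, data in enumerate(copies):
        if checksum(data) == expected_sum:
            return data, index
    raise IOError("no copy matches the checksum")


good = b"block contents"
bad = b"block c0ntents"
good_sum = checksum(good)

# A flaky link: the first transmission is corrupted, the resend is clean.
transmissions = iter([bad, good])
data, tries = receive_with_retry(lambda: next(transmissions), good_sum)

# A two-way mirror: the first copy is bad, the second copy is intact.
mirror_data, mirror_index = read_from_copies([bad, good], good_sum)

# A single disk: the error is detected, but there is nothing to recover from.
try:
    read_from_copies([bad], good_sum)
    single_disk_recovered = True
except IOError:
    single_disk_recovered = False
```

With no second copy, the checksum can only report the problem; recovery needs redundancy somewhere in the stack.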
In this kind of configuration, one would benefit performance-wise from not
having to calculate checksums again.
Checksums in the outer pool effectively protect against disk issues. If
hardware fails and data is corrupted, isn't the outer pool's redundancy going
to handle it for the inner pool as well?
The only thing that comes to mind is that if something happens to the outer
pool, the inner pool is no longer aware of possibly broken data, which can
lead to issues.
Yours
Markus Kovero
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss