On 04/17/09 12:37, casper....@sun.com wrote:
>> I'd like to submit an RFE suggesting that data + checksum be copied for
>> mirrored writes, but I won't waste anyone's time doing so unless you
>> think there is a point. One might argue that a machine this flaky should
>> be retired, but it is actually working quite well, and perhaps represents
>> not even the extreme of bad hardware that ZFS might encounter.
> I think it's a stupid idea. If you get two checksums, what can you do?
> The second copy is most likely suspect and you double your chance that you
> use bad memory.
> Casper
If there were permanently bad memory locations, surely the diagnostics
would reveal them. Here's an interesting paper on memory errors:
http://www.ece.rochester.edu/~mihuang/PAPERS/hotdep07.pdf
Given the inevitability of relatively frequent transient memory
errors, I would think it behooves the file system to minimize the
effects of such errors. But I won't belabor the point, except to
suggest that the extra step would not be very expensive (either to
implement or to run).
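To make the argument concrete, here is a toy model of one possible
reading of the suggestion. It is not ZFS code: SHA-256 stands in for
whatever checksum the pool uses, and the function names
(shared_buffer_write, copied_buffer_write, good_sides) are made up for
illustration. It assumes a single transient bit flip that happens after
the checksum has been computed and before the data reaches the disks.

```python
import hashlib


def checksum(buf: bytes) -> bytes:
    # Stand-in checksum; chosen only because it is in the standard library.
    return hashlib.sha256(buf).digest()


def flip_bit(buf: bytearray, bit: int) -> None:
    # Simulate a transient single-bit memory error in place.
    buf[bit // 8] ^= 1 << (bit % 8)


def shared_buffer_write(data: bytes, bit: int):
    """Both mirror sides are written from the same in-memory buffer."""
    buf = bytearray(data)
    cksum = checksum(bytes(buf))
    flip_bit(buf, bit)            # the only copy in memory is damaged
    return cksum, bytes(buf), bytes(buf)


def copied_buffer_write(data: bytes, bit: int):
    """Hypothetical variant: the second side gets its own private copy."""
    buf = bytearray(data)
    cksum = checksum(bytes(buf))
    private = bytearray(buf)      # extra copy made before the error hits
    flip_bit(buf, bit)            # only one of the two buffers is damaged
    return cksum, bytes(buf), bytes(private)


def good_sides(cksum: bytes, *sides: bytes):
    # Which mirror sides would still verify against the stored checksum?
    return [checksum(side) == cksum for side in sides]


if __name__ == "__main__":
    block = b"x" * 512
    print("shared buffer:", good_sides(*shared_buffer_write(block, 100)))
    print("copied buffer:", good_sides(*copied_buffer_write(block, 100)))
```

In this model the shared buffer leaves no side that verifies, while the
private copy leaves one. It says nothing about Casper's objection: the
extra copy occupies more memory, and the model simply assumes the error
lands in one buffer and not the other.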
Memory diagnostics ran for a full 12 hours with no errors. Same goes
for both disks, using Solaris format/ana/verify. So far, after
creating 400,000 files, two files had permanent, apparently truly
unrecoverable errors and could not be read by anything.
Now it gets really funky. I detached one of the disks, and then found
it couldn't be reattached. Turns out there is a rounding problem with
Solaris fdisk (run from format) that can cause identical partitions on
identical disks to have different sizes. I used the Linux sfdisk
utility to repair the MBR and fix the Solaris2 partition sizes. Then
it was possible to reattach the disk. Unfortunately it wasn't possible
to boot from the result, but a reinstall went perfectly with no ZFS
errors being reported at all. So it appears that the problem may be
with the OpenSolaris fdisk. Is this worth reporting as a bug? It is
likely to be quite hard to reproduce...
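For anyone wanting to check for this kind of mismatch before attaching
a disk, here is a minimal sketch that reads the MBR partition table
from two disks and compares the sector counts of the Solaris2
partitions. It assumes a classic MBR (four 16-byte entries at offset
446, boot signature 0x55AA) and partition ID 0xBF for Solaris2. The
helper names and the device paths in the usage part are placeholders,
not taken from the system described above.

```python
import struct

SOLARIS2_ID = 0xBF        # fdisk partition ID for Solaris2
TABLE_OFFSET = 446        # start of the partition table in the MBR
ENTRY_SIZE = 16           # bytes per partition entry


def parse_mbr(mbr: bytes):
    """Return (index, type, start_lba, num_sectors) for each used entry."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for i in range(4):
        off = TABLE_OFFSET + ENTRY_SIZE * i
        part_type = mbr[off + 4]
        # Start LBA and sector count are little-endian 32-bit values.
        start_lba, num_sectors = struct.unpack_from("<II", mbr, off + 8)
        if part_type != 0:
            entries.append((i + 1, part_type, start_lba, num_sectors))
    return entries


def solaris2_sectors(mbr: bytes):
    """Sector counts of all Solaris2 partitions found in the MBR."""
    return [n for (_, t, _, n) in parse_mbr(mbr) if t == SOLARIS2_ID]


def read_mbr(path: str) -> bytes:
    # Reads the first 512-byte sector; needs read access to the device.
    with open(path, "rb") as f:
        return f.read(512)


if __name__ == "__main__":
    import sys
    # Usage (paths are placeholders): python mbrcmp.py <disk-a> <disk-b>
    a, b = (solaris2_sectors(read_mbr(p)) for p in sys.argv[1:3])
    print("disk A:", a, "disk B:", b, "match" if a == b else "MISMATCH")
```

If the two counts differ, even by a few sectors, that would be
consistent with the rounding problem described above.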
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss