>RAID-Z isn't even necessary to have this issue; all you need is a disk
>that doesn't actually guarantee atomicity of single-sector writes.
>Which, of course, we have to cope with.
>
>The key is that there's actually a ring of 128 uberblocks, indexed
>by transaction group number (mod 128). When we open a storage pool,
>we read every uberblock; among those that have a valid SHA-256
>checksum, we take the one with the highest transaction group (txg).
>That will be, by definition, the uberblock for the last txg that
>successfully committed to disk.
>
>If we lose power in the middle of writing an uberblock, then that
>uberblock won't checksum, so we'll use the one from the previous txg,
>i.e. the last one that synced completely.

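To check my understanding, here is the selection rule from the quoted text as a small Python sketch. This is only a toy model, not ZFS code: the record layout (an 8-byte txg followed by a SHA-256 digest) and the names pack_uberblock, unpack_uberblock, write_uberblock and select_active_txg are all made up for illustration. Only the ring size of 128, the mod-128 indexing, the SHA-256 check and the "highest valid txg wins" rule come from the description above.

```python
import hashlib
import struct
from typing import List, Optional

RING_SIZE = 128      # number of uberblock slots, per the description above
DIGEST_LEN = 32      # length of a SHA-256 digest in bytes


def pack_uberblock(txg: int) -> bytes:
    """Hypothetical layout: little-endian 8-byte txg, then SHA-256 of it."""
    body = struct.pack("<Q", txg)
    return body + hashlib.sha256(body).digest()


def unpack_uberblock(raw: bytes) -> Optional[int]:
    """Return the txg if the record checksums correctly, otherwise None."""
    if len(raw) != 8 + DIGEST_LEN:
        return None
    body, digest = raw[:8], raw[8:]
    if hashlib.sha256(body).digest() != digest:
        return None
    return struct.unpack("<Q", body)[0]


def write_uberblock(ring: List[bytes], txg: int) -> None:
    """Store the uberblock for txg in slot txg mod RING_SIZE."""
    ring[txg % RING_SIZE] = pack_uberblock(txg)


def select_active_txg(ring: List[bytes]) -> Optional[int]:
    """Read every slot; among those that checksum, take the highest txg."""
    valid = [txg for txg in map(unpack_uberblock, ring) if txg is not None]
    return max(valid, default=None)
```

In this model a torn write is just a slot whose bytes no longer match their digest, so select_active_txg skips it and falls back to the previous txg.
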
Thanks for providing this last bit of my mental ZFS picture.

Does ZFS keep statistics on how many uberblocks are bad when it imports
a pool? Or is it the case that when fewer than 128 uberblocks have ever
been committed, the remainder will be bogus?

Casper
_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
