> Since transactions in ZFS are committed until the ueberblock is written,
> this boils down to:
>
> "How is the ueberblock committed atomically in a RAID-Z configuration?"

RAID-Z isn't even necessary to have this issue; all you need is a disk that doesn't actually guarantee atomicity of single-sector writes. Which, of course, we have to cope with.

The key is that there's actually a ring of 128 uberblocks, indexed by transaction group number (mod 128). When we open a storage pool, we read every uberblock; among those that have a valid SHA-256 checksum, we take the one with the highest transaction group (txg). That will be, by definition, the uberblock for the last txg that successfully committed to disk.

If we lose power in the middle of writing an uberblock, then that uberblock won't checksum, so we'll use the one from the previous txg, i.e. the last one that synced completely.

Jeff

_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
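
For anyone who finds it easier to follow in code, here is a rough Python sketch of the selection rule described above. It is only an illustration: the slot layout (an 8-byte txg, an opaque payload, then a trailing SHA-256 digest) and all function names are made up for this example and are not the actual ZFS on-disk format or code. Only the ring size of 128, the (mod 128) indexing, the SHA-256 check, and the "highest valid txg wins" rule come from the post.

```python
import hashlib
import struct

RING_SIZE = 128    # number of uberblock slots, as stated in the post
DIGEST_LEN = 32    # length of a SHA-256 digest in bytes
TXG_LEN = 8        # hypothetical: txg stored as a little-endian 64-bit integer


def encode_uberblock(txg, payload):
    """Build a slot image: txg + payload + SHA-256 of both (hypothetical layout)."""
    body = struct.pack("<Q", txg) + payload
    return body + hashlib.sha256(body).digest()


def decode_uberblock(raw):
    """Return (txg, payload) if the checksum verifies, otherwise None."""
    if len(raw) < TXG_LEN + DIGEST_LEN:
        return None  # empty or truncated slot
    body, digest = raw[:-DIGEST_LEN], raw[-DIGEST_LEN:]
    if hashlib.sha256(body).digest() != digest:
        return None  # torn or corrupted write
    (txg,) = struct.unpack("<Q", body[:TXG_LEN])
    return txg, body[TXG_LEN:]


def commit(ring, txg, payload):
    """Write the uberblock for txg into its slot, indexed by txg mod RING_SIZE."""
    ring[txg % RING_SIZE] = encode_uberblock(txg, payload)


def select_active(ring):
    """Scan every slot and keep the valid uberblock with the highest txg."""
    best = None
    for raw in ring:
        decoded = decode_uberblock(raw)
        if decoded is not None and (best is None or decoded[0] > best[0]):
            best = decoded
    return best
```

A torn write can be simulated by overwriting only the first part of a slot with the new image and leaving the old tail in place. The checksum then fails for that slot, and `select_active` falls back to the uberblock from the previous txg.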
