Jeff Bonwick <[EMAIL PROTECTED]> writes:

> > Since transactions in ZFS aren't committed until the ueberblock is written,
> > this boils down to:
> >
> > "How is the ueberblock committed atomically in a RAID-Z configuration?"
>
> RAID-Z isn't even necessary to have this issue; all you need is a disk
> that doesn't actually guarantee atomicity of single-sector writes.
> Which, of course, we have to cope with.
>
> The key is that there's actually a ring of 128 uberblocks, indexed
> by transaction group number (mod 128). When we open a storage pool,
> we read every uberblock; among those that have a valid SHA-256
> checksum, we take the one with the highest transaction group (txg).
> That will be, by definition, the uberblock for the last txg that
> successfully committed to disk.
>
> If we lose power in the middle of writing an uberblock, then that
> uberblock won't checksum, so we'll use the one from the previous txg,
> i.e. the last one that synced completely.
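
To make sure I follow, here is a rough Python sketch of the selection rule described above. It is a toy model and not ZFS code: the slot layout (an 8-byte txg, a payload, then a SHA-256 digest), the in-memory list standing in for the disk, and all the function names are my own assumptions for illustration. Only the ring size of 128, the (mod 128) indexing, the SHA-256 check, and the "highest valid txg wins" rule come from the message.

```python
import hashlib
import struct

RING_SIZE = 128      # number of uberblock slots, per the quoted message
CHECKSUM_LEN = 32    # SHA-256 digest length in bytes


def encode_uberblock(txg, payload):
    """Hypothetical slot layout: 8-byte txg, payload, SHA-256 of both."""
    body = struct.pack(">Q", txg) + payload
    return body + hashlib.sha256(body).digest()


def decode_uberblock(raw):
    """Return (txg, payload) if the slot checksums correctly, else None."""
    if raw is None or len(raw) < 8 + CHECKSUM_LEN:
        return None
    body, stored = raw[:-CHECKSUM_LEN], raw[-CHECKSUM_LEN:]
    if hashlib.sha256(body).digest() != stored:
        return None  # torn or corrupted write
    (txg,) = struct.unpack(">Q", body[:8])
    return txg, body[8:]


def write_uberblock(ring, txg, payload):
    """Store the uberblock for txg in slot txg mod RING_SIZE."""
    ring[txg % RING_SIZE] = encode_uberblock(txg, payload)


def find_active_uberblock(ring):
    """Scan every slot; among the valid ones, keep the highest txg."""
    best = None
    for raw in ring:
        decoded = decode_uberblock(raw)
        if decoded is not None and (best is None or decoded[0] > best[0]):
            best = decoded
    return best


def demo():
    ring = [None] * RING_SIZE
    for txg in range(1, 201):
        write_uberblock(ring, txg, f"root-{txg}".encode())

    # Simulate power loss while writing txg 201: only the first half of
    # the new slot contents lands, the rest is zeros.
    new = encode_uberblock(201, b"root-201")
    half = len(new) // 2
    ring[201 % RING_SIZE] = new[:half] + bytes(len(new) - half)

    return find_active_uberblock(ring)


if __name__ == "__main__":
    print(demo())
```

In this model the half-written slot for txg 201 fails its checksum and is skipped, so the scan falls back to txg 200, the last one that was written completely.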

I *love* information!

I'm not a DB or filesystem engineer, so much of this is not stuff I
work with every day; and I've always wondered how people got around
issues like power failures and other uncontrollable shutdowns really
reliably and cleanly. I think this is a way I haven't read about
before, and it makes perfect sense and seems fairly cheap (you only
have to look at all 128 on startup).

-- 
David Dyer-Bennet, <mailto:[EMAIL PROTECTED]>, <http://www.dd-b.net/dd-b/>
RKBA: <http://www.dd-b.net/carry/>
Pics: <http://dd-b.lighthunters.net/> <http://www.dd-b.net/dd-b/SnapshotAlbum/>
Dragaera/Steven Brust: <http://dragaera.info/>
_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
