Eric Schrock wrote:
On May 19, 2009, at 12:57 PM, Dave wrote:
If you don't have mirrored slogs and the slog fails, you may lose any
data that was in a txg waiting to be committed to the main pool
vdevs - you will never know if you lost any data or not.
None of the above is correct. First off, you only lose data if the slog
fails *and* the machine panics/reboots before the transaction group is
synced (5-30s by default depending on load, though there is a CR filed
to immediately sync on slog failure). You will not lose any data once
the txg is synced - syncing the transaction group does not require
reading from the slog, so failure of the log device does not impact
normal operation.
Thanks for correcting my statement. There is still a potential window
of approximately 60 seconds for data loss if there are 2 transaction
groups waiting to sync with a 30-second txg commit timer, correct?
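
To make the arithmetic behind that question concrete, here is a minimal
Python sketch. It is only a model of this discussion, not ZFS code: the
function names are hypothetical, the idea that the window is simply the
number of unsynced txgs multiplied by the commit timer is my assumption,
and the 2 pending txgs and 30 second timer are just the figures from the
question above.

```python
def data_at_risk(slog_failed: bool, crashed_before_sync: bool) -> bool:
    """Loss condition as described in this thread: the slog must fail
    AND the machine must panic/reboot before the txg is synced."""
    return slog_failed and crashed_before_sync


def worst_case_window_s(pending_txgs: int, txg_timer_s: float) -> float:
    """Assumed upper bound on the exposure window: every pending txg
    waits one full commit-timer period before it is synced."""
    if pending_txgs < 0 or txg_timer_s < 0:
        raise ValueError("inputs must be non-negative")
    return pending_txgs * txg_timer_s


if __name__ == "__main__":
    # Slog failure alone does not put data at risk once the txg syncs.
    print(data_at_risk(slog_failed=True, crashed_before_sync=False))
    # The figures from the question: 2 pending txgs, 30 second timer.
    print(worst_case_window_s(2, 30))
```

Whether 2 pending txgs is the right figure is exactly what I am asking;
the sketch only shows where the 60 seconds comes from.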
The latter half of the above statement is also incorrect. Should you
find yourself in the double-failure described above, you will get an FMA
fault that describes the nature of the problem and the implications. If
the slog is truly dead, you can 'zpool clear' (or 'fmadm repair') the
fault and use whatever data you still have in the pool. If the slog is
just missing, you can insert it and continue without losing data. In no
case will ZFS silently continue without committed data.
How will it know that data was actually lost? Or does it just alert you
that it's possible data was lost?
There's also the worry that the pool is not importable if you did have
the double failure scenario and the log really is gone (re: bug ID
6733267), e.g. if you had done a 'zpool import -o cachefile=none mypool'.
--
Dave
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss