Eric Schrock wrote:

On May 19, 2009, at 12:57 PM, Dave wrote:

If you don't have mirrored slogs and the slog fails, you may lose any data that was in a transaction group waiting to be committed to the main pool vdevs - you will never know if you lost any data or not.

None of the above is correct. First off, you only lose data if the slog fails *and* the machine panics/reboots before the transaction group is synced (5-30s by default depending on load, though there is a CR filed to immediately sync on slog failure). You will not lose any data once the txg is synced - syncing the transaction group does not require reading from the slog, so failure of the log device does not impact normal operation.


Thanks for correcting my statement. There is still a potential window of approximately 60 seconds for data loss if there are 2 transaction groups waiting to sync with a 30 second txg commit timer, correct?
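To make the arithmetic behind that estimate explicit, here is a rough sketch in Python. It only multiplies the two numbers from the question (2 pending transaction groups, a 30 second timer); the function name and the assumption that pending txgs simply add up are illustrative, not a description of how ZFS actually schedules syncs.

```python
def worst_case_window_seconds(pending_txgs, txg_timer_seconds):
    """Upper bound on the exposure window, ASSUMING each pending
    transaction group waits one full timer interval before syncing.
    This is the question's own model, not confirmed ZFS behavior."""
    if pending_txgs < 0 or txg_timer_seconds < 0:
        raise ValueError("inputs must be non-negative")
    return pending_txgs * txg_timer_seconds


# The case from the question: 2 txgs waiting, 30 second commit timer.
print(worst_case_window_seconds(2, 30))
```

With the 5-30s range quoted earlier in the thread, the same model gives a window of 10 to 60 seconds for two pending txgs.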

The latter half of the above statement is also incorrect. Should you find yourself in the double-failure described above, you will get an FMA fault that describes the nature of the problem and the implications. If the slog is truly dead, you can 'zpool clear' (or 'fmadm repair') the fault and use whatever data you still have in the pool. If the slog is just missing, you can insert it and continue without losing data. In no cases will ZFS silently continue without committed data.


How will it know that data was actually lost? Or does it just alert you that it's possible data was lost?

There's also the worry that the pool is not importable if you did have the double failure scenario and the log really is gone (see bug ID 6733267), for example if you had done a 'zpool import -o cachefile=none mypool'.

--
Dave
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
