> I see the source for some confusion.  On the ZFS Best Practices page:
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
> 
> It says:
> Failure of the log device may cause the storage pool to be inaccessible
> if you are running the Solaris Nevada release prior to build 96 and a
> release prior to the Solaris 10 10/09 release.
> 
> It also says:
> If a separate log device is not mirrored and the device that contains
> the log fails, storing log blocks reverts to the storage pool.

I have some more concrete data on this now.  We are running Solaris 10u8
(which is 10/09), fully updated last weekend.  We wanted to explore the
consequences of adding or failing a non-mirrored log device, so we created a
pool with a non-mirrored ZIL log device and experimented with it:

(a)  Simply yank out the non-mirrored log device while the system is live.
The result was:  any zfs or zpool command would hang permanently.  Even "zfs
list" hangs permanently.  The system cannot shut down, cannot reboot, cannot
"zfs send" or "zfs snapshot" or anything ... It's a bad state.  You're
basically hosed.  Power cycle is the only option.

(b)  After power cycling, the system won't boot.  It gets part way through
the boot process, and eventually just hangs there, infinitely cycling error
messages about services that couldn't start.  Random services, such as
inetd, which seem unrelated to some random data pool that failed.  So we
power cycle again, and go into failsafe mode, to clean up and destroy the
old messed up pool ... Boot up totally clean again, and create a new totally
clean pool with a non-mirrored log device.  Just to ensure we really are
clean, we simply "zpool export" and "zpool import" with no trouble, and
reboot once for good measure.  "zfs list" and everything are all working
great...

(c)  Do a "zpool export."  Obviously, the ZIL log device is clean and
flushed at this point, not being used.  We simply yank out the log device,
and do "zpool import."  Well ... Without that log device, I forget the
terminology, it said something like "missing disk."  Plain and simple, you
*can* *not* import the pool without the log device.  It does not say "to
force use -f" and even if you specify the -f, it still just throws the same
error message, missing disk or whatever.  Won't import.  Period.
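
For anyone who wants to repeat this, here is a minimal sketch of the pool
layout in question, written as a Python helper that only builds the command
line and does not run anything.  The pool name "tank" and the device names
are made up for illustration; the "log" and "mirror" keywords are ordinary
"zpool create" syntax.

```python
def zpool_create_argv(pool, data_devs, log_devs):
    """Build the argv for 'zpool create' with a separate log device.

    One log device gives the non-mirrored layout tested above;
    two or more are placed in a single mirrored log vdev.
    """
    if not data_devs or not log_devs:
        raise ValueError("need at least one data device and one log device")
    argv = ["zpool", "create", pool] + list(data_devs) + ["log"]
    if len(log_devs) > 1:
        argv.append("mirror")
    return argv + list(log_devs)


# Hypothetical names: pool "tank", disks c1t0d0 through c1t3d0.
print(" ".join(zpool_create_argv("tank", ["c1t0d0"], ["c1t1d0"])))
print(" ".join(zpool_create_argv("tank", ["c1t0d0"], ["c1t2d0", "c1t3d0"])))
```

The first line printed is the non-mirrored case we tested; the second is the
mirrored alternative, which we did not test here.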

...  So, to anybody who said the failed log device will simply fail over to
blocks within the main pool:  Sorry.  That may be true in some later
version, but it is not the slightest bit true in the absolute latest Solaris
(proper) available today.

I'm going to venture a guess that this is no longer a problem as of zpool
version 19, which is when "ZFS log device removal" was introduced.

Unfortunately, the latest version of Solaris only goes up to zpool version
15.
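
If that guess is right, the thing to check on any given box is the pool
version.  Below is a rough sketch of that check in Python.  It assumes you
paste in the table printed by "zpool get version <pool>"; the sample text
here is illustrative, not captured from a real system, and the function
names are my own.  Note that it only tells you whether the pool version
includes log device removal, not whether the hang described above is
actually gone.

```python
LOG_REMOVAL_VERSION = 19  # zpool version that introduced log device removal


def parse_pool_version(zpool_get_output):
    """Extract the VALUE column from 'zpool get version <pool>' output."""
    for line in zpool_get_output.splitlines():
        fields = line.split()
        # Data rows look like: NAME PROPERTY VALUE SOURCE
        if len(fields) >= 3 and fields[1] == "version":
            return int(fields[2])
    raise ValueError("no 'version' property line found")


def supports_log_removal(version):
    """True if the pool version is at least the one that added log removal."""
    return version >= LOG_REMOVAL_VERSION


# Illustrative output only; "tank" is a made-up pool name.
sample = """NAME  PROPERTY  VALUE  SOURCE
tank  version   15     default
"""
v = parse_pool_version(sample)
print(v, supports_log_removal(v))
```

For a version 15 pool like ours, the check comes back False.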

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
