On Apr 7, 2010, at 10:19 AM, Bob Friesenhahn wrote:
> On Wed, 7 Apr 2010, Edward Ned Harvey wrote:
>>> From: Ragnar Sundblad [mailto:ra...@csc.kth.se]
>>> 
>>> Rather: ... >=19 would be ... if you don't mind losing data written
>>> the ~30 seconds before the crash, you don't have to mirror your log
>>> device.
>> 
>> If you have a system crash, *and* a failed log device at the same time, this
>> is an important consideration.  But if you have either a system crash or a
>> failed log device, and they don't happen at the same time, then your sync
>> writes are safe, right up to the nanosecond, using an unmirrored nonvolatile
>> log device on zpool >= 19.
> 
> The point is that the slog is a write-only device, and a device which fails 
> such that it acks each write but cannot read back the data it "wrote" could 
> silently fail at any time during the normal operation of the system.  It is 
> not necessary for the slog device to fail at the exact same time that the 
> system spontaneously reboots.  I don't know if Solaris implements a 
> background scrub of the slog in the normal course of operation, which would 
> cause a device with this sort of failure to be exposed quickly.

You are playing against marginal returns. An ephemeral storage requirement
is very different from a permanent storage requirement.  For permanent storage
services, scrubs work well -- you can have good assurance that if you read
the data once, then you will likely be able to read the same data again, with
a probability governed by the expected decay of the data. For ephemeral data,
you never read the same data more than once, so there is no correlation
between reading it once and reading it again later.  In other words, testing
the readability of an ephemeral storage service is like a cat chasing its
tail.  IMHO, this is particularly problematic for contemporary SSDs that
implement wear leveling.
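
To put a number on the cat-chasing-its-tail point, here is a toy simulation
(plain Python, nothing to do with the actual ZFS code; the device size,
failure fraction, and block counts are all made-up assumptions). It only
shows that a clean read-back check is perfectly predictive when you re-read
the same blocks, and nearly worthless when wear leveling means the blocks
you will actually need are different ones:

# Toy model, not ZFS code: why a read-back check (scrub) gives assurance
# for permanent data but very little for an ephemeral slog.
# All numbers are illustrative assumptions.
import random

random.seed(0)
N_BLOCKS = 100_000   # hypothetical device size in blocks
P_BAD = 0.01         # assumed fraction of blocks that ack writes but return garbage
TRIALS = 10_000

# Blocks that silently fail on read.
bad = {b for b in range(N_BLOCKS) if random.random() < P_BAD}

surprise_perm = surprise_ephem = 0
for _ in range(TRIALS):
    # Permanent data: the scrub reads the very blocks you will need later.
    blocks = random.sample(range(N_BLOCKS), 10)
    scrub_clean = all(b not in bad for b in blocks)
    later_read_ok = all(b not in bad for b in blocks)   # same blocks, same answer
    if scrub_clean and not later_read_ok:
        surprise_perm += 1

    # Ephemeral slog: wear leveling puts the next burst of sync writes on
    # different physical blocks, and those are the only ones read after a crash.
    old_blocks = random.sample(range(N_BLOCKS), 10)
    new_blocks = random.sample(range(N_BLOCKS), 10)
    check_clean = all(b not in bad for b in old_blocks)
    replay_ok = all(b not in bad for b in new_blocks)
    if check_clean and not replay_ok:
        surprise_ephem += 1

print("permanent: clean check but failed read:", surprise_perm, "/", TRIALS)
print("ephemeral: clean check but failed replay:", surprise_ephem, "/", TRIALS)

With those made-up numbers the ephemeral case reports a clean check followed
by a failed replay in roughly 8-9% of trials; the exact figure doesn't matter,
only that it is non-zero while the permanent case is zero by construction
(media decay is ignored here, which is what scrubs are for).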

<sidebar>
For clusters, the same sort of problem exists for path monitoring. If you think
about paths (networks, SANs, cups-n-strings) then there is no assurance 
that a passed transfer means all subsequent transfers will also pass. Some other
permanence test is required to predict future transfer success.
</sidebar>

Bottom line: if you are more paranoid, mirror the separate log devices and
sleep through the night.  Pleasant dreams! :-)
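
If you do decide to mirror, it is a one-liner; pool and device names below
are just placeholders, and as always check the zpool(1M) man page:

  # add a mirrored log vdev to an existing pool
  zpool add tank log mirror c3t0d0 c3t1d0

  # or attach a second device to an existing, unmirrored log
  zpool attach tank c3t0d0 c3t1d0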
 -- richard


ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com 
