comment far below...
On Aug 21, 2009, at 6:17 PM, Tim Cook wrote:
On Fri, Aug 21, 2009 at 8:04 PM, Richard Elling <richard.ell...@gmail.com> wrote:
On Aug 21, 2009, at 5:55 PM, Tim Cook wrote:
On Fri, Aug 21, 2009 at 7:41 PM, Richard Elling <richard.ell...@gmail.com> wrote:
My vote is with Ross. KISS wins :-)
Disclaimer: I'm also a member of BAARF.
My point is, RAIDZx+1 SHOULD be simple. I don't entirely understand
why it hasn't been implemented. I can only imagine like so many
other things it's because there hasn't been significant customer
demand. Unfortunate if it's as simple as I believe it is to
implement. (No, don't ask me to do it, I put in my time programming
in college and have no desire to do it again :))
You can get in the same ballpark with at least two top-level raidz2
vdevs and copies=2. If you have three or more top-level raidz2 vdevs,
then you can even do better with copies=3 ;-)
Note that I do not have a model for that because it would require
separate failure rate data for whole disk failures and all other
non-whole disk failures. The latter is not available in data sheets.
The closest I can get with published data is using the MTTDL[2] model
which considers the published unrecoverable read error rate. In other
words, the model would be easy, but data to feed the model is not
available :-( Suffice to say, 2 top-level raidz2 vdevs of similar size
with copies=2 should offer very nearly the same protection as
raidz2+1.
-- richard
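As a rough illustration of the kind of arithmetic such a model involves,
here is a minimal sketch. It is not the MTTDL[2] model itself, whose
exact form is not given in this thread. It assumes the common textbook
approximation for a double-parity set that counts whole-disk failures
only, plus a simple estimate of the chance of hitting an unrecoverable
read error while reading a given number of bits. The function names and
all parameters are hypothetical, chosen for the example.

```python
import math


def mttdl_double_parity(mtbf_hours, n_disks, mttr_hours):
    """Textbook MTTDL approximation for one double-parity set.

    Assumes independent whole-disk failures with a constant rate;
    data is lost when a third disk fails while two are being repaired.
    """
    return mtbf_hours ** 3 / (
        n_disks * (n_disks - 1) * (n_disks - 2) * mttr_hours ** 2
    )


def p_unrecoverable_read(bits_read, uer_per_bit):
    """Probability of at least one unrecoverable read error.

    Computes 1 - (1 - uer_per_bit) ** bits_read in a numerically
    stable way, assuming errors are independent per bit.
    """
    return -math.expm1(bits_read * math.log1p(-uer_per_bit))
```

For example, `p_unrecoverable_read(8e12, 1e-14)` estimates the chance of
an unrecoverable read error while reading 1 TB from a disk with an
assumed error rate of one per 10^14 bits. Neither function captures
non-whole-disk failures, which is exactly the data gap described above.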
You sure about that? Say I have a SAS controller shit the bed
(pardon the French), and take one of the JBODs out entirely. Even
with copies=2, isn't the entire pool going tits up and offline when
it loses an entire vdev?
Yes. But you need to understand that the probability of a SAS
controller failing is much, much smaller than that of a disk failing.
So in order to properly model the system, you can't treat them as
having the same failure rate (the difference is an order of magnitude
for HDDs). Depending on the repair policy, the probability of losing a
SAS controller is expected to be less than the probability of losing 3
disks in a raidz2. Since SAS is relatively easy to make redundant, a
really paranoid person would have two SAS controllers, and the
probability of losing two highly-reliable SAS controllers at the same
time is way small :-)
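To put a rough number on "way small", here is a minimal sketch assuming
a constant failure rate and fully independent controllers. The MTBF
figures and the repair window are hypothetical placeholders, not values
from any data sheet, and the controller is simply assumed to be an
order of magnitude more reliable than a disk.

```python
import math


def p_fail_within(mtbf_hours, window_hours):
    """P(at least one failure in the window) for a constant failure rate."""
    return -math.expm1(-window_hours / mtbf_hours)


# Hypothetical figures for illustration only, not from any data sheet.
DISK_MTBF_HOURS = 1_000_000
CONTROLLER_MTBF_HOURS = 10 * DISK_MTBF_HOURS  # assumed 10x better than a disk
REPAIR_WINDOW_HOURS = 72  # assumed time to replace a failed controller

# Chance that a single controller fails inside one repair window.
p_one = p_fail_within(CONTROLLER_MTBF_HOURS, REPAIR_WINDOW_HOURS)

# Rough chance that both controllers fail inside the same window,
# assuming their failures are independent.
p_pair = p_one ** 2

print(f"one controller:   {p_one:.3e}")
print(f"both controllers: {p_pair:.3e}")
```

The independence assumption is the weak point: a shared power supply,
firmware bug, or cabling fault would make a double failure more likely
than this estimate suggests.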
It would seem to me copies=2 is only applicable when you have both
an entire disk loss, and corrupt data on the "good disks". But feel
free to enlighten :) That scenario seems far less likely than
having a controller go bad, but that's with my anecdotal personal
experiences.
As the Kinks sing, "paranoia will destroy ya!" :-)
-- richard
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss