On Thu, May 3, 2012 at 3:35 PM, Edward Ned Harvey
<opensolarisisdeadlongliveopensola...@nedharvey.com> wrote:
>
> I think you'll get better, both performance & reliability, if you break each
> of those 15-disk raidz3's into three 5-disk raidz1's.  Here's why:

Incorrect on reliability; see below.

> Now, to put some numbers on this...
> A single 1T disk can sustain (let's assume) 1.0 Gbit/sec read/write
> sequential.  This means resilvering the entire disk sequentially, including
> unused space, (which is not what ZFS does) would require 2.2 hours.  In
> practice, on my 1T disks, which are in a mirrored configuration, I find
> resilvering takes 12 hours.  I would expect this to be ~4 days if I were
> using 5-disk raidz1, and I would expect it to be ~12 days if I were using
> 15-disk raidz3.

Based on your use of "I would expect", I'm guessing you haven't
done the actual measurement.

I see ~12-16 hour resilver times on pools using 1TB drives in
raidz configurations. The resilver times don't seem to vary
with whether I'm using raidz1 or raidz2.
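
As a sanity check on the 2.2 hour figure quoted above, here is a
minimal Python sketch of the arithmetic. It assumes the quoted
1.0 Gbit/sec sustained rate and a decimal 1TB (10^12 bytes); the
function name is purely illustrative.

```python
def sequential_hours(capacity_bytes: float, rate_bits_per_sec: float) -> float:
    """Hours needed to stream an entire disk once at a constant rate."""
    seconds = capacity_bytes * 8 / rate_bits_per_sec
    return seconds / 3600


# Assumed inputs: 1TB = 1e12 bytes, 1.0 Gbit/sec = 1e9 bits/sec.
hours = sequential_hours(1e12, 1e9)
print(f"{hours:.1f} hours")  # 2.2 hours
```

That is the floor for one full sequential pass; the 12-16 hour
resilvers reported above are roughly 5 to 7 times that.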

> Suddenly the prospect of multiple failures overlapping don't seem so
> unlikely.

Which is *exactly* why you need multiple-parity solutions. Put
simply, if you're using single-parity redundancy (raidz1 or 2-way
mirroring) with drives of 1TB or larger, then you're putting your
data at risk. I'm seeing occasional read errors during rebuild of
raidz1 vdevs, leading to data loss; the rate is very low, but clearly
non-zero. Usually it's just one file, so it's not too bad (and zfs
will tell you which file has been lost). And the error rates we're
observing for uncorrectable (and undetectable) errors from drives
are actually slightly better than you would expect from the
manufacturers' spec sheets.
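
To put a rough number on what the spec sheets would predict, here is
a back-of-envelope sketch in Python. It assumes a rated unrecoverable
read error rate of 1 in 10^14 bits (a common consumer-drive rating,
not a figure from this thread), completely full drives, and
independent errors. The function and parameter names are illustrative
only.

```python
import math


def p_read_error(disks_read: int, disk_bytes: float, ure_per_bit: float) -> float:
    """Probability of at least one unrecoverable read error while
    reading every bit of `disks_read` surviving disks."""
    bits = disks_read * disk_bytes * 8
    # 1 - (1 - p)^bits, computed via log1p/expm1 to avoid rounding loss.
    return -math.expm1(bits * math.log1p(-ure_per_bit))


URE_PER_BIT = 1e-14  # assumed spec-sheet rating, not a measured value

# Rebuilding a 5-disk raidz1 means reading the 4 surviving disks.
p_1tb = p_read_error(4, 1e12, URE_PER_BIT)
p_3tb = p_read_error(4, 3e12, URE_PER_BIT)
print(f"1TB drives: {p_1tb:.2f}")  # about 0.27
print(f"3TB drives: {p_3tb:.2f}")  # about 0.62
```

Real pools are rarely full and a resilver only reads allocated blocks,
so this overstates the risk, but it shows how the exposure of a
single-parity rebuild grows with drive size.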

So you definitely need raidz2 rather than raidz1; I'm looking at
going to raidz3 for solutions using current high-capacity (i.e. 3TB)
drives.

(On performance, I know what the theory says about getting one
disk's worth of IOPS out of each vdev in a raidz configuration. In
practice we're finding that our raidz systems actually perform
pretty well when compared with dynamic stripes, mirrors, and
hardware raid LUNs.)

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
