On May 16, 2011, at 5:02 AM, Sandon Van Ness <san...@van-ness.com> wrote:

> On 05/15/2011 09:58 PM, Richard Elling wrote:
>>>  In one of my systems, I have 1TB mirrors, 70% full, which can be
>>> sequentially completely read/written in 2 hrs.  But the resilver took 12
>>> hours of idle time.  Supposing you had a 70% full pool of raidz3, 2TB disks,
>>> using 10 disks + 3 parity, and a usage pattern similar to mine, your
>>> resilver time would have been minimum 10 days,
>> bollix
>> 
>>> likely approaching 20 or 30
>>> days.  (Because you wouldn't get 2-3 weeks of consecutive idle time, and the
>>> random access time for a raidz approaches 2x the random access time of a
>>> mirror.)
>> totally untrue
>> 
>>> BTW, the reason I chose 10+3 disks above was just because it makes
>>> calculation easy.  It's easy to multiply by 10.  I'm not suggesting using
>>> that configuration.  You may notice that I don't recommend raidz for most
>>> situations.  I endorse mirrors because they minimize resilver time (and
>>> maximize performance in general).  Resilver time is a problem for ZFS, which
>>> they may fix someday.
>> Resilver time is not a significant problem with ZFS. Resilver time is a much
>> bigger problem with traditional RAID systems. In any case, it is bad systems
>> engineering to optimize a system for best resilver time.
>>  -- richard
> 
> Actually, I have seen resilvers take a very long time (weeks) on
> Solaris/raidz2, whereas I almost never see a hardware RAID controller take
> more than a day or two. In one case I thrashed the disks absolutely as hard
> as I could (hardware controller) and finally got the rebuild to take
> almost a week. Here is an example of one right now:
> 
>   pool: raid3060
>   state: ONLINE
>   status: One or more devices is currently being resilvered. The pool will
>   continue to function, possibly in a degraded state.
>   action: Wait for the resilver to complete.
>   scrub: resilver in progress for 224h54m, 52.38% done, 204h30m to go
>   config:

I have seen worse cases, but the root cause was hardware failures
that are not reported by zpool status. Have you checked the health
of the disk transports? Hint: fmdump -e
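Something along these lines will show whether the transports are throwing
errors (event classes and detail will vary with your HBA and disks):

  fmdump -e | tail -20    # recent error reports; repeated transport or
                          # retryable-command events point at cabling,
                          # expander, or disk trouble
  fmdump -eV | less       # verbose detail for each error report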

Also, what zpool version is this? Two relevant changes went in last year:
improvements to the resilver prefetch and the introduction of resilver
throttles. The former makes resilver faster; the latter intentionally slows
it down to protect foreground I/O.
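If you are not sure which version the pool is at, something like this will
tell you (pool name taken from your zpool status output above):

  zpool get version raid3060   # on-disk version of this pool
  zpool upgrade -v             # versions this ZFS build supports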

As a rule of thumb, the resilvering disk is expected to max out at around
80 IOPS for a 7,200 rpm disk. If you see fewer than 80 IOPS, suspect the
throttles or a broken data path.
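You can watch that directly while the resilver runs, for example with
10-second samples (sum r/s and w/s for the disk being resilvered):

  iostat -xn 10    # per-device reads/writes per second; ~80 combined on a
                   # 7,200 rpm disk means it is seek-bound as expected,
                   # while much less suggests the throttle or a sick path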
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss