On Wed, Mar 21, 2012 at 7:56 AM, Jim Klimov <jimkli...@cos.ru> wrote:
> 2012-03-21 7:16, MLR wrote:
> One thing to note is that many people would not recommend using
> a "disbalanced" ZFS array - one expanded by adding a TLVDEV after
> many writes, or one consisting of differently-sized TLVDEVs.
> ZFS does a rather good job of trying to use available storage
> most efficiently, but it was often reported that it hits some
> algorithmic bottleneck when one of the TLVDEVs is about 80-90%
> full (even if others are new and empty). Blocks are balanced
> across TLVDEVs on write, so your old data is not magically
> redistributed until you explicitly rewrite it (i.e. zfs send
> or rsync into another dataset on this pool).
I have been running ZFS in a mission-critical application since
zpool version 10 and have not seen any issues when some of the vdevs
in a zpool are full while others are virtually empty. We have been running
commercial Solaris 10 releases. The configuration was that each
business unit had a separate zpool consisting of mirrored pairs of 500
GB LUNs from SAN based storage. Each zpool started with enough storage
for that business unit. As each business unit filled their space, we
added additional mirrored pairs of LUNs. So the smallest unit had one
mirror vdev and the largest had 13 vdevs. In the case of the two
largest (13 and 11 vdevs) most of the vdevs were well above 90%
utilized and there were 2 or 3 almost empty vdevs. We never saw any
reliability issues with this condition. As for performance, the
storage was NOT our performance bottleneck, so I do not know whether
this situation caused any performance issues.
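The growth pattern described above is just `zpool add` with another
mirrored pair. A minimal sketch; the pool name and Solaris-style device
names here are placeholders, not our actual configuration:

```shell
# Pool and device names are hypothetical; substitute your own SAN LUNs.
# Start the business unit's pool with a single mirrored pair of 500 GB LUNs:
zpool create businessunit1 mirror c1t0d0 c2t0d0

# When that fills, grow the pool by adding another mirrored pair --
# new writes will favor the emptier vdev:
zpool add businessunit1 mirror c1t1d0 c2t1d0

# Verify the resulting layout:
zpool status businessunit1
```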
> So I'd suggest that you keep your disks separate, with two
> pools made from 1.5Tb disks and from 3Tb disks, and use these
> pools for different tasks (i.e. a working set with relatively
> high turnaround and fragmentation, and WORM static data with
> little fragmentation and high read performance).
> Also this would allow you to more easily upgrade/replace the
> whole set of 1.5Tb disks when the time comes.
I have never tried mixing drives of different sizes or performance
characteristics in the same zpool or vdev, except as a temporary
migration strategy. You already know that growing a RAIDz vdev is
currently impossible, so with a RAIDz strategy your only option for
growth is to add complete RAIDz vdevs, and you _want_ those to match
in terms of performance or you will have unpredictable performance.
For situations where you _might_ want to grow the data capacity in the
future I recommend mirrors, but note (Richard Elling posted hard data
on this to the list a while back) that to match the reliability of
RAIDz2 you need more than a 2-way mirror. In my mind, the larger the
amount of data (and the larger the drives), the _more_ reliability you need.
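By contrast with RAIDz, a mirror vdev can be grown in place by swapping
in larger disks one side at a time. A hedged sketch; pool and device
names are placeholders, and the autoexpand property requires a
reasonably recent zpool version:

```shell
# Replace each side of the mirror with a larger disk, waiting for the
# resilver to complete between steps:
zpool replace tank c1t0d0 c3t0d0
zpool replace tank c2t0d0 c4t0d0

# Once both sides are the larger size, let the pool use the new space:
zpool set autoexpand=on tank
# (or, on older releases: zpool online -e tank c3t0d0 c4t0d0)
```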
We are no longer using the configuration described above. The
current configuration is five JBOD chassis of 24 drives each. We have
22 vdevs, each a RAIDz2 consisting of one drive from each chassis,
plus 10 hot spares. Our priority was reliability, followed by capacity
and performance. If we could have, we would have just used 3- or 4-way
mirrors, but we needed more capacity than those would have provided. I note that
in pre-production testing we did have two of the five JBOD chassis go
offline at once and did not lose _any_ data. The total pool size is
about 40 TB.
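The arithmetic behind that layout, and why losing two whole chassis is
survivable, can be checked in a few lines. The numbers come from the
description above; the variable names are mine:

```python
# Sanity-check the 5-JBOD layout described above.
chassis = 5
bays_per_chassis = 24
total_bays = chassis * bays_per_chassis        # 120 drive bays

vdevs = 22
drives_per_vdev = chassis                      # one drive per chassis
hot_spares = 10

# 22 vdevs x 5 drives + 10 spares fills every bay exactly:
assert vdevs * drives_per_vdev + hot_spares == total_bays

# RAIDZ2 tolerates two failed drives per vdev. Because each vdev has
# exactly one drive in each chassis, losing any two whole chassis costs
# every vdev exactly two drives -- the most RAIDZ2 can absorb.
parity_per_vdev = 2
drives_lost_per_vdev = 2                       # two chassis offline
assert drives_lost_per_vdev <= parity_per_vdev
print("layout survives any two chassis failing")
```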
We also have a redundant copy of the data on a remote system. That
system only has two JBOD chassis, and capacity is the priority. The
zpool consists of two vdevs, each a RAIDz2 of 23 drives, plus two hot
spares. The performance is dreadful, but we _have_ the data in case of
a real disaster.
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company
-> Technical Advisor, Troy Civic Theatre Company
-> Technical Advisor, RPI Players
zfs-discuss mailing list