Thanks, Al.  I've been tracking ZFS (and playing with it for almost 4
months now), so I'm very much interested in phasing out our UFS and VxFS
systems for it, though obviously not immediately.

Let me give a more concrete example:

We have an archive server, where old builds go to die. I mean, rest.
Yeah. :-)

So, I'll be using this:

3 x Sun 3511FC SATA JBODs, each with 12 400GB disks. The setup is an
11-wide stripe of 3-disk RAIDZ vdevs, with the remaining 3 drives as
hot-spares.  They'll attach to x4200 Opterons or V440 Sparcs as the
fileservers (in each case, they have a minimum of 2 Gigabit ethernet
interfaces bonded for network bandwidth).
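
In zpool terms, that layout works out to something like the sketch below
(the pool name and device names are placeholders for whatever the 3511s
actually enumerate as, and the trailing spare group assumes a build with
hot-spare support):

  zpool create archive \
      raidz c2t0d0  c2t1d0  c2t2d0 \
      raidz c2t3d0  c2t4d0  c2t5d0 \
      raidz c2t6d0  c2t7d0  c2t8d0 \
      raidz c2t9d0  c2t10d0 c2t11d0 \
      raidz c3t0d0  c3t1d0  c3t2d0 \
      raidz c3t3d0  c3t4d0  c3t5d0 \
      raidz c3t6d0  c3t7d0  c3t8d0 \
      raidz c3t9d0  c3t10d0 c3t11d0 \
      raidz c4t0d0  c4t1d0  c4t2d0 \
      raidz c4t3d0  c4t4d0  c4t5d0 \
      raidz c4t6d0  c4t7d0  c4t8d0 \
      spare c4t9d0  c4t10d0 c4t11d0

That should come out to 11 x 2 x 400GB, or roughly 8.8TB usable.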

The files on the system will be source code build files and resulting
binaries (lots of .c, .h and .o files): roughly 75% of the files are
10-100kB in size, 20% are 100-500kB, and 5% are 1MB-10MB.

The usage pattern on these is this:

small random reads (50%) - read the directory metadata, then possibly
several small files.  A few tens of MB, at most.

large read (25%) - read a large section of directory tree, including all
files inside (i.e. copy a whole directory tree from the disk - say a
previous entire build of the product).  200MB-2GB total.

large write (25%) - write an entire new directory tree with dozens of
directories and several thousand files (e.g. a new build of the product),
200MB to 2GB or so in total.
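
In command terms, the two big cases boil down to something like the
following (the paths and pool name here are made up); timing them with
ptime while watching zpool iostat in another window seems like the
simplest way to compare RAM configurations:

  # large read: stream a previous build tree back off the archive
  ptime tar cf - /archive/builds/product-x/build-0412 > /dev/null

  # large write: land a fresh build tree onto the archive
  ptime cp -rp /stage/product-x/build-0413 /archive/builds/product-x/

  # meanwhile, watch the pool:
  zpool iostat archive 5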




How does ZFS's filesystem cache impact write performance?  And how well
does its re-ordering do at turning large numbers of small reads into
larger sequential reads (that is, does adding lots of RAM help this
significantly, or is there an upper bound)?

-Erik




On Fri, 2006-05-12 at 15:14 -0500, Al Hopper wrote:
> On Fri, 12 May 2006, Erik Trimble wrote:
> 
> > I'm looking at using ZFS as our main file server FS over here.
> >
> > I can do the disk layout tuning myself, but what I'm more interested in
> > is getting thoughts on the amount of RAM that might help performance on
> > these machines.  Assume I've got more than enough network and disk
> > bandwidth, and the disks are JBODs, so there is no NVRAM or anything
> > else on the arrays.
> >
> > The machines will be doing NFS and AFS filesharing ONLY, so this
> > discussion is relevant to ZFS as a fileserver, with no other
> > considerations.
> >
> >
> > Now, there are three usage patterns (I'm interested in tuning for each
> > scenario, as they probably will be separate machines):
> >
> >
> > (A)  random small read (80%)/ small write (20%)  - files in the sub 1MB
> > size, usually in the 50-200kB size
> >
> > (B)  sequential small read (80%) / small write (20%) - e.g. copy the
> > entire contents of directories around.  Files in mid-100k range, copying
> > 10s of MB total at a time.
> >
> > (C)  random read/write inside a single file (e.g. database store) - file
> > is under 10GB or so.
> >
> >
> > I'm assuming that ZFS read-ahead benefits greatly by more RAM for (A),
> > but I'm less sure about the other two cases, and I'm not sure at all
> > about how more RAM helps write performance (if at all).
> 
> Erik, I'll give you a generic answer to your generic question - given that
> there is a lack of specifics on both the question and answer side of the
> equation!  IMHO you absolutely need to be running the target platform(s)
> in 64-bit mode.  That will ensure that there is sufficient kernel memory
> available for ZFS in the Update 2 release.
> 
> Secondly, my recommendation for the "sweet spot" in terms of system
> memory is 4GB - or more.  With 4GB, ZFS performance is very, very nice.
> This personal opinion is from my hands-on eval/testing.
> 
> > Ideas?  Point me to docs?
> 
> This answer would fall under the "ideas" category! :)
> 
> In terms of docs, the only other point of reference is that RAIDZ vdevs
> should be composed of a single-digit number of disk drives, between 3
> (minimum) and 9 (maximum).  So choose the disk drive size to provide the
> storage pool size (IOW capacity) you need for your application user
> community.  Or form multiple pools to satisfy your storage requirements.
> 
> Remember that this is the first production release of zfs.  While I'm very
> confident of the world class capabilities/talent of the zfs development
> team ... you can't compare zfs, in terms of absolute reliability, with an
> alternative filesystem (like UFS) that has millions & millions of hours of
> accumulated usage across a very broad set of application problem domains.
> So plan for the occasional glitch/bug.  IOW - don't put all your
> filesystem "eggs" in one "basket" - and don't operate without a "safety
> net".
> 
> I would whole-heartedly recommend zfs for your production environment.  The
> upside so outweighs any perceived downsides - and the performance,
> ease of use, capabilities etc. are so far ahead of UFS that it'll simply
> blow you away.
> 
> Al Hopper  Logical Approach Inc, Plano, TX.  [EMAIL PROTECTED]
>            Voice: 972.379.2133 Fax: 972.379.2134  Timezone: US CDT
> OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
-- 
Erik Trimble
Java System Support
Mailstop:  usca14-102
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
