On Jul 31, 2006, at 2:17 PM, Jan Schaumann wrote:

> Hello all,
>
> After setting up a Solaris 10 machine with ZFS as the new NFS server,
> I'm stumped by some serious performance problems. Here are the
> (admittedly long) details (also noted at
> http://www.netmeister.org/blog/):
>
> The machine in question is a dual-amd64 box with 2GB RAM and two
> Broadcom gigabit NICs. The OS is Solaris 10 6/06 and the filesystem
> consists of a single zpool striped across the two halves of an Apple
> Xserve RAID (each half configured as RAID5), providing a pool of
> 5.4 TB.
Hello Jan, I'm in a very similar situation.

I have two Xserve RAIDs; half of the disks in each are configured as a
RAID5 LUN, and those two LUNs are presented to a Sun X4100 and mirrored
in ZFS there. This server provides NFS for research-related storage and
has multiple NFS clients running either Linux 2.4.x or IRIX 6.5.x.
These clients and their NFS server have a dedicated gig-e network
between them, and performance is not exactly stellar. A sketch of the
pool layout follows.
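For the record, the setup amounts to something like this (a sketch;
the device names are hypothetical, the pool name is taken from the
paths later in this message):

=====================
# mirror the two RAID5 LUNs (one from each Xserve RAID) in ZFS,
# then share the result over NFS
zpool create ds2-store mirror c4t0d0 c5t0d0
zfs set sharenfs=on ds2-store
=====================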
For instance, here is the second command you gave, run on a Linux NFS
client:
=====================
[EMAIL PROTECTED]/mnt$ time dd if=/dev/zero of=blah bs=1024k count=128
128+0 records in
128+0 records out
real 1m21.343s
user 0m0.000s
sys 0m2.480s
=====================
1m21s to write a 128MB file over an NFSv3 mount on a gig-e network;
that works out to roughly 1.6 MB/s. The mount options for this Linux
client are nfsvers=3,rsize=32768,wsize=32768. No matter what rsize and
wsize I set, the time trial results are always in the vicinity of
1m20s.
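For reference, the full mount invocation on the Linux side looks
roughly like this (the server and path are the same ones used in the
tests below):

=====================
mount -t nfs -o nfsvers=3,rsize=32768,wsize=32768 ds2.rs:/ds2-store/test/smbshare /mnt
=====================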
Mounting with NFSv2 and running the same test is even worse. No, it's
horrendous and scary:
=====================
[EMAIL PROTECTED]/mnt$ time dd if=/dev/zero of=blah5 bs=1024k count=128
128+0 records in
128+0 records out
real 36m5.642s
user 0m0.000s
sys 0m2.370s
=====================
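That works out to roughly 60 KB/s.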
If I run that test on the NFS server itself, in the same volume that is
NFS-mounted on the above Linux client, I get decent speed:
=====================
[EMAIL PROTECTED]/ds2-store/test/smbshare$ time dd if=/dev/zero of=blah2 bs=1024k count=128
128+0 records in
128+0 records out
real 0m0.214s
user 0m0.001s
sys 0m0.212s
=====================
So it seems ZFS itself is OK on top of these Apple Xserve RAIDs (which
are running the 1.5 firmware). Note that compression is turned on for
this ZFS filesystem, and dd from /dev/zero writes data that compresses
to almost nothing, so the 0.2s local result overstates real disk
throughput.
I replicated your #3 command and the Linux NFS client read the file
back in 2.4 seconds (this was after an umount and remount of the NFS
share).
So while reads from an NFS client seem OK (still not great, though),
and writing locally to the ZFS volume on the NFS server is also OK,
writing over NFS across the dedicated gig-e network is painfully slow.
I, too, see bursty traffic during the NFS writes.
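If anyone wants to watch the burstiness, the stock tools are enough;
here is one way to do it on the Solaris server (the pool name is taken
from the paths above):

=====================
# per-second pool I/O; the write column sits near zero, then spikes
zpool iostat ds2-store 1

# server-side NFS operation counts
nfsstat -s

# watch the NFS RPC traffic on the wire
snoop rpc nfs
=====================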
I put a Solaris 10 host on this gig-e NFS-only network (the same one
the Linux client is on), mounted the same NFS share with the equivalent
options, and got far, far better results:
=====================
[EMAIL PROTECTED]/$ mount -o vers=3 ds2.rs:/ds2-store/test/smbshare /mnt
[EMAIL PROTECTED]/$ cd /mnt
[EMAIL PROTECTED]/mnt$ time dd if=/dev/zero of=blah bs=1024k count=128
128+0 records in
128+0 records out
real 0m13.349s
user 0m0.001s
sys 0m0.519s
=====================
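That is about 9.6 MB/s: over six times the Linux client's write rate,
though still well short of what gig-e can carry.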
... And a read of that file after an umount/remount (to clear any
local cache):
=====================
[EMAIL PROTECTED]/$ umount /mnt
[EMAIL PROTECTED]/$ mount -o vers=3 ds2.rs:/ds2-store/test/smbshare /mnt
[EMAIL PROTECTED]/$ cd /mnt
[EMAIL PROTECTED]/mnt$ time dd if=blah of=/dev/null bs=1024k
128+0 records in
128+0 records out
real 0m11.481s
user 0m0.001s
sys 0m0.295s
=====================
Hmm. It took nearly as long to read the file as it did to write it:
128MB in 11.5 seconds is only about 11 MB/s on a gigabit link. Without
a remount the file reads back in 0.24 seconds (to be expected, of
course, since it comes from the client's cache).
So what does this exercise leave me thinking? Is Linux 2.4.x really
screwed up in NFS-land? This Solaris NFS server replaces a Linux-based
NFS server that the same clients (Linux and IRIX) liked just fine.
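If anyone wants to check whether the 2.4 client is the culprit, a
first step would be to compare what the two client types actually
negotiate and send. A sketch, assuming a reasonably stock nfs-utils on
the Linux side:

=====================
# on the Linux 2.4 client: negotiated mount options and client-side
# RPC/NFS call counts
nfsstat -m
nfsstat -c

# on the Solaris 10 client, for comparison
nfsstat -m
=====================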
/dale