On Nov 18, 2007 1:44 PM, Richard Elling <[EMAIL PROTECTED]> wrote:
> one more thing...
>
> Joe Little wrote:
> > I have historically noticed that in ZFS, whenever there is a heavy
> > writer to a pool via NFS, the reads can be held back (basically
> > paused). An example is a RAID10 pool of 6 disks, where writing a
> > directory of files, including some large 100+ MB ones, can cause
> > other clients over NFS to pause for seconds (5-30 or so). This is
> > on B70 bits. I've gotten used to this behavior over NFS, but didn't
> > see the same thing when doing similar actions on the server itself.
> >
> > To improve the situation, I thought perhaps I could dedicate a log
> > device outside the pool, in the hope that while heavy writes went to
> > the log device, reads would merrily be allowed to coexist from the
> > pool itself. My test case isn't ideal per se, but I added a local
> > 9 GB SCSI (80) drive for a log, and added two LUNs for the pool
> > itself. You'll see from the output below that while the log device
> > is pegged at 15 MB/sec (sd5), my directory-listing requests against
> > devices sd15 and sd16 are never answered. I tried this with
> > no-cache-flush both enabled and disabled, with negligible
> > difference. Is there any way to force a better balance of reads and
> > writes during heavy writes?
> >
> >                  extended device statistics
> > device   r/s    w/s   kr/s     kw/s  wait  actv  svc_t  %w  %b
> > fd0      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > sd0      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > sd1      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > sd2      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > sd3      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > sd4      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > sd5      0.0  118.0    0.0  15099.9   0.0  35.0  296.7   0 100
>
> When you see actv = 35 and svc_t > ~20, then it is possible that you
> can improve performance by reducing the zfs_vdev_max_pending queue
> depth. See
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
>
> This will be particularly true for JBODs.
>
> Doing a little math, there are ~4.5 MBytes queued in the drive waiting
> to be written. 4.5 MBytes isn't much for a typical RAID array, but for
> a disk it is often a sizeable chunk of its available cache. A 9 GByte
> disk, being rather old, has a pretty wimpy microprocessor, so you are
> basically beating the poor thing senseless. Reducing the queue depth
> will allow the disk to perform more efficiently.
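For anyone following along, the queue-depth change Richard is pointing
at is, as far as I read the Evil Tuning Guide, roughly the following
(syntax from memory, so verify it against the guide; 10 is just an
example value, the default at the time being 35):

    # persistent: add this line to /etc/system and reboot
    #   set zfs:zfs_vdev_max_pending = 10
    # or poke the live kernel with mdb (0t10 = decimal 10):
    echo zfs_vdev_max_pending/W0t10 | mdb -kw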
I'll be trying an 18 GB 10K RPM drive tomorrow. Again, the test was
simply to see whether, by having a slog, NFS would allow concurrent
reads and writes. Especially in the iSCSI case, but even with JBOD, I
find that _any_ heavy writing completely postpones reads for NFS
clients, which makes ZFS and NFS impractical under I/O duress. My goal
was simply to see how things work. It appears from Neil's reply that it
won't, and that the per-filesystem synchronicity RFE is what is needed,
or at least zil_disable, for NFS to be practically usable today. As for
max_pending, I did try lowering it (to values of 10 and 20) on a JBOD,
without any success.

> -- richard
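For completeness, the slog and ZIL knobs I'm referring to look roughly
like this; the pool and device names below are made up, and disabling
the ZIL is host-wide and throws away the synchronous semantics NFS
depends on, so treat it strictly as a test setting:

    # attach a dedicated log device to an existing pool
    # (pool and device names are placeholders)
    zpool add mypool log c2t0d0

    # disable the ZIL for testing: add to /etc/system and reboot ...
    #   set zfs:zil_disable = 1
    # ... or flip it live and then remount the filesystem:
    echo zil_disable/W0t1 | mdb -kw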