On Nov 16, 2007 9:13 PM, Neil Perrin <[EMAIL PROTECTED]> wrote:
> Joe,
>
> I don't think adding a slog helped in this case. In fact I
> believe it made performance worse. Previously the ZIL would be
> spread out over all devices, but now all synchronous traffic
> is directed at one device (and everything is synchronous in NFS).
> Mind you, 15MB/s seems a bit on the slow side - especially if
> cache flushing is disabled.
>
> It would be interesting to see what all the threads are waiting
> on. I think the problem may be that everything is backed
> up waiting to start a transaction because the txg train is
> slow due to NFS requiring the ZIL to push everything synchronously.
>
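(For what it's worth, one way to get at what all the threads are waiting
on - a rough sketch, assuming mdb and DTrace are available on the NFS
server; nothing below is output from this box:)

    # Dump every kernel thread with its stack, to see whether the NFS
    # service threads are parked in txg/ZIL wait routines:
    echo "::threadlist -v" | mdb -k

    # While the heavy writer runs, count kernel stacks entering
    # zil_commit(), the path the ZIL takes for synchronous writes:
    dtrace -n 'fbt::zil_commit:entry { @[stack()] = count(); }'
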
I agree completely. The log (even though slow) was an attempt to isolate
writes away from the pool. I guess the question is how to provide async
access for NFS. We may have 16, 32, or however many threads, but if a
single writer keeps the ZIL pegged and prohibits reads, it's all for
nought. Is there any way to tune/configure the ZFS/NFS combination to
balance reads and writes so that one doesn't starve the other? It's
either feast or famine, or so tests have shown. (A rough sketch of the
commands and the cache-flush tunable in question is at the end of this
mail.)

> Neil.
>
>
> Joe Little wrote:
> > I have historically noticed that in ZFS, whenever there is a heavy
> > writer to a pool via NFS, reads can be held back (basically paused).
> > An example is a RAID10 pool of 6 disks, where writing a directory of
> > files, including some large 100+MB ones, can cause other clients over
> > NFS to pause for seconds (5-30 or so). This is on B70 bits. I've
> > gotten used to this behavior over NFS, but I didn't see the same
> > thing when doing similar actions on the server itself.
> >
> > To improve upon the situation, I thought perhaps I could dedicate a
> > log device outside the pool, in the hope that while heavy writes went
> > to the log device, reads would merrily be allowed to coexist from the
> > pool itself. My test case isn't ideal per se, but I added a local 9GB
> > SCSI (80) drive for a log, and added two LUNs for the pool itself.
> > You'll see from the below that while the log device is pegged at
> > 15MB/sec (sd5), my directory listing requests on devices sd15 and
> > sd16 are never answered. I tried this with no-cache-flush both
> > enabled and disabled, with negligible difference. Is there any way to
> > force a better balance of reads and writes during heavy writes?
> >
> >                     extended device statistics
> > device    r/s     w/s    kr/s      kw/s  wait  actv  svc_t  %w  %b
> > fd0       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd0       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd1       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd2       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd3       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd4       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd5       0.0   118.0     0.0   15099.9   0.0  35.0  296.7   0 100
> > sd6       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd7       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd8       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd9       0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd10      0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd11      0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd12      0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd13      0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd14      0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd15      0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> > sd16      0.0     0.0     0.0       0.0   0.0   0.0    0.0   0   0
> ...
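For reference, a rough sketch of the setup and the cache-flush tunable
discussed above. The pool and device names are placeholders, not the
actual ones from the box in the iostat output:

    # Dedicate a separate intent-log (slog) device to an existing pool,
    # so that ZIL traffic is kept off the main data disks:
    zpool add tank log c2t5d0

    # The "no-cache-flush" setting mentioned above: add this line to
    # /etc/system and reboot. Only safe when the devices have
    # non-volatile (battery-backed) write caches:
    set zfs:zfs_nocacheflush = 1

    # Watch per-device read/write traffic while the heavy NFS writer runs:
    iostat -x 1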