On Nov 16, 2007 9:17 PM, Joe Little <[EMAIL PROTECTED]> wrote:
> On Nov 16, 2007 9:13 PM, Neil Perrin <[EMAIL PROTECTED]> wrote:
> > Joe,
> >
> > I don't think adding a slog helped in this case. In fact I
> > believe it made performance worse. Previously the ZIL would be
> > spread out over all devices, but now all synchronous traffic
> > is directed at one device (and everything is synchronous in NFS).
> > Mind you, 15MB/s seems a bit on the slow side - especially if
> > cache flushing is disabled.
> >
> > It would be interesting to see what all the threads are waiting
> > on. I think the problem may be that everything is backed
> > up waiting to start a transaction, because the txg train is
> > slow due to NFS requiring the ZIL to push everything synchronously.
> >
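One way to see what the threads are actually waiting on is with the
stock Solaris tools below (a sketch only: the output path is arbitrary,
the 10-second window is arbitrary, and the function names in the
comments are just the sort of thing to look for):

    # Dump all kernel thread stacks; grep the output for threads parked
    # in txg_wait_open()/zil_commit()-style functions:
    echo "::threadlist -v" | mdb -k > /var/tmp/threads.txt

    # Aggregate off-CPU kernel stacks for ~10 seconds with DTrace to
    # see where threads spend their time blocked:
    dtrace -n 'sched:::off-cpu { @[stack()] = count(); } tick-10s { exit(0); }'
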
Roch wrote this before (thus my interest in the log or NVRAM-like
solution):

"There are 2 independent things at play here.

a) NFS sync semantics conspire against single-thread performance with
any backend filesystem. However, NVRAM normally offers some relief
from the issue.

b) ZFS sync semantics, along with the storage software + imprecise
protocol in between, conspire against ZFS performance of some
workloads on NVRAM-backed storage, NFS being one of the affected
workloads.

The conjunction of the 2 causes worse-than-expected NFS performance
over a ZFS backend running __on NVRAM-backed storage__. If you are not
considering NVRAM storage, then I know of no ZFS/NFS-specific problems.

Issue b) is being dealt with by both Solaris and storage vendors (we
need a refined protocol); issue a) is not related to ZFS and is rather
a fundamental NFS issue. Maybe a future NFS protocol will help.

Net net: if one finds a way to 'disable cache flushing' on the storage
side, then one reaches the state we'll be in, out of the box, once b)
is implemented by Solaris _and_ the storage vendor. At that point, ZFS
becomes a fine NFS server not only on JBOD, as it is today, but also
on NVRAM-backed storage.

It's complex enough, I thought it was worth repeating."

>
> I agree completely. The log (even though slow) was an attempt to
> isolate writes away from the pool. I guess the question is how to
> provide for async access for NFS. We may have 16, 32 or whatever
> threads, but if a single writer keeps the ZIL pegged and prohibits
> reads, it's all for naught. Is there any way to tune/configure the
> ZFS/NFS combination to balance reads/writes so that one doesn't
> starve the other? It's either feast or famine, or so tests have
> shown.
>
> > Neil.
> >
> >
> > Joe Little wrote:
> > > I have historically noticed that in ZFS, whenever there is a heavy
> > > writer to a pool via NFS, the reads can be held back (basically
> > > paused). An example is a RAID10 pool of 6 disks, whereby a directory
> > > of files being written, including some 100+MB in size, can cause
> > > other clients over NFS to pause for seconds (5-30 or so). This is on
> > > B70 bits. I've gotten used to this behavior over NFS, but didn't see
> > > it perform as such when on the server itself doing similar actions.
> > >
> > > To improve upon the situation, I thought perhaps I could dedicate a
> > > log device outside the pool, in the hopes that while heavy writes
> > > went to the log device, reads would merrily be allowed to coexist
> > > from the pool itself. My test case isn't ideal per se, but I added a
> > > local 9GB SCSI (80) drive for a log, and added two LUNs for the pool
> > > itself. You'll see from the below that while the log device is
> > > pegged at 15MB/sec (sd5), my directory listing requests on devices
> > > sd15 and sd16 are never answered. I tried this with both
> > > no-cache-flush enabled and off, with negligible difference. Is there
> > > any way to force a better balance of reads/writes during heavy
> > > writes?
> > >
> > >                     extended device statistics
> > > device    r/s    w/s   kr/s     kw/s  wait  actv  svc_t  %w  %b
> > > fd0       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd0       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd1       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd2       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd3       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd4       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd5       0.0  118.0    0.0  15099.9   0.0  35.0  296.7   0 100
> > > sd6       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd7       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd8       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd9       0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd10      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd11      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd12      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd13      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd14      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd15      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > > sd16      0.0    0.0    0.0      0.0   0.0   0.0    0.0   0   0
> > ...
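For anyone following along, the "no-cache-flush" toggle referred to
above, and the ZIL switch it is often mentioned alongside, are
/etc/system tunables rather than pool properties. A minimal sketch,
assuming Nevada-era bits (the exact cache-flush variable depends on
the build, and disabling the ZIL gives up NFS crash consistency, so
treat it as a diagnostic only):

    * Stop ZFS from issuing cache-flush commands to the storage
    * (older bits used zil_noflush instead):
    set zfs:zfs_nocacheflush = 1

    * Disable the ZIL entirely - diagnostic only; NFS clients can see
    * silent data loss if the server crashes:
    set zfs:zil_disable = 1

Either requires a reboot to take effect when set in /etc/system (or
the variables can be poked live with mdb -kw).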