On Nov 16, 2007 9:13 PM, Neil Perrin <[EMAIL PROTECTED]> wrote:
> Joe,
>
> I don't think adding a slog helped in this case. In fact I
> believe it made performance worse. Previously the ZIL would be
> spread out over all devices, but now all synchronous traffic
> is directed at one device (and everything is synchronous in NFS).
> Mind you, 15MB/s seems a bit on the slow side - especially if
> cache flushing is disabled.
>
> It would be interesting to see what all the threads are waiting
> on. I think the problem may be that everything is backed
> up waiting to start a transaction because the txg train is
> slow due to NFS requiring the ZIL to push everything synchronously.
>

I agree completely. The log (even though slow) was an attempt to
isolate writes away from the pool. I guess the question is how to
provide async access for NFS. We may have 16, 32 or however many
threads, but if a single writer keeps the ZIL pegged and blocks
reads, it's all for naught. Is there any way to tune/configure the
ZFS/NFS combination to balance reads and writes so that one doesn't
starve the other? It's either feast or famine, or so the tests have shown.
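On Neil's suggestion about seeing what the threads are waiting on,
something like

    echo "::threadlist -v" | mdb -k

should show what the nfsd and txg threads are actually blocked on
(I haven't captured that yet). The only knobs I know of to play with
on the ZIL side are the /etc/system tunables below - a sketch only,
and zil_disable in particular throws away NFS crash consistency, so
I'd only use it to confirm the ZIL is the bottleneck, never in
production:

    * /etc/system fragment (illustrative; needs a reboot)
    * don't issue cache-flush commands to the devices
    set zfs:zfs_nocacheflush = 1
    * disable the ZIL entirely - test-only, unsafe for NFS semantics
    set zfs:zil_disable = 1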

> Neil.
>
>
> Joe Little wrote:
> > I have historically noticed that in ZFS, whenever there is a heavy
> > writer to a pool via NFS, the reads can be held back (basically paused).
> > For example, on a RAID-10 pool of 6 disks, writing a directory of
> > files, some of them 100+MB in size, can cause other NFS clients to
> > pause for seconds (5-30 or so). This is on B70 bits. I've gotten used
> > to this behavior over NFS, but didn't see the same behavior when
> > doing similar actions on the server itself.
> >
> > To improve the situation, I thought perhaps I could dedicate a
> > log device outside the pool, in the hope that while heavy writes went
> > to the log device, reads would merrily be served from the
> > pool itself. My test case isn't ideal per se, but I added a local 9GB
> > SCSI (80) drive for a log, and added two LUNs for the pool itself.
> > You'll see from the below that while the log device is pegged at
> > 15MB/sec (sd5), my directory listing requests on devices sd15 and sd16
> > are never answered. I tried this with no-cache-flush both enabled and
> > disabled, with negligible difference. Is there any way to force a better
> > balance of reads and writes during heavy writes?
> >
> >                  extended device statistics
> > device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
> > fd0       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd0       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd2       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd3       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd4       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd5       0.0  118.0    0.0 15099.9  0.0 35.0  296.7   0 100
> > sd6       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd7       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd8       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd9       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd10      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd11      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd12      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd13      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd14      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd15      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> > sd16      0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0
> ...
>
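For reference, the slog described above was attached roughly as
below, and "zpool iostat -v" shows the split between slog and pool
traffic directly (pool and device names here are placeholders, not
the ones from the test box):

    # attach a separate intent-log (slog) device to an existing pool
    zpool add mypool log c2t5d0
    # watch per-vdev read/write activity, including the log, every 5s
    zpool iostat -v mypool 5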