> The client is using async writes, that include commits. Sync writes do not
> need commits.
Are you saying nfs commit operations sent by the client aren't always
reported by that script?
> What happens is that the ZFS transaction group commit occurs at more-or-less
> regular intervals, likely 5 seconds for more modern ZFS systems. When the
> commit occurs, any data that is in the ARC but not commited in a prior
> group gets sent to the ZIL. This is why you might see a very different amount
> ZIL activity relative to the expected write workload.
I thought this was the other way around, synchronous writes
immediately get copied to ZIL blocks (logbias may cause them to be
allocated from the data disks despite the existence of a separate log
device) in order to not require that a txg go through immediately to
service them, then a txg that includes the same data eventually goes
directly to the data disks, and then they get removed from the ZIL.
> With sync=disabled, the ZIL is not used, thus the commit response to the
> is a lie, breaking the covenant between the server and client.
I was not clear, the behavior with sync=disabled was exactly what I
expected it to be, my question was why the Com_t attribute was nonzero
with sync=standard, while Commits stayed zero. To be clear, while it
was in the middle of copying data, the "Commits" column always read
zero, never even one commit reported (in fact, Commits was always
zero, even when a copy completed). I would expect this to mean the
nfs client was issuing asynchronous writes without a single commit
request (eventually filling the server's cache and requiring the
server to do something to resolve it). I am copying a single large
file, 8GB, and usually interrupting it before it finishes, after I
have some data that matches the period where it is writing to the log
devices (takes a few seconds to get through the client and server
caches and start using the disks).
> Different applications can react to long commit times differently. In this
> we see 1.9 seconds for the commit versus about 400 microseconds for each
> async write. The cause of the latency of the commit is not apparent from any
> bandwidth measurements (eg zpool iostat) and you should consider looking
> more closely at the "iostat -x" latency to see if the log devices are
I am still unclear on the commit time being reported for zero reported
commit requests. Here is what I think are the log devices (I am
unsure how to translate the sd## mapping to solaris style
controller/target devices) under iostat -x 5 when they are being
written to heavily:
device r/s w/s kr/s kw/s wait actv svc_t %w %b
sd30 0.0 1073.1 0.0 72940.6 0.0 7.6 7.1 1 77
sd31 0.0 1075.3 0.0 73090.2 0.0 7.6 7.1 1 77
Considering their maximum write speed is reported as 90MB/s, and they
each wrote 70MB/s in that interval, I do not expect the svc_t to
reflect the seek time they can achieve. There were no particularly
suspicious spikes in svc_t on any device while the Com_t was large. I
have a guess that your script is reporting the time taken to flush a
cache that got filled with async writes, and needed to be emptied to
be able to continue to service async writes (a lot of data, therefore
a lot of time).
I'm wondering if this belongs in nfs-discuss, if the nfs server issues
synchronous writes to empty its cache (without receiving an nfs commit
request), it seems to explain everything.
zfs-discuss mailing list