Thanks, Darren.

I've taken the liberty of reordering my follow-up for clarity. Just to be clear, I'm not critiquing ZFS, just trying to learn it by pushing at some of the corner cases of filesystem-system interactions. I'm trying to figure out the implications of various ZFS features when used in various ways.
Does that mean that paging out dirty mmap pages go to new places and
require metadata updates as well?
Yes.
Indeed, given that ZFS always writes to new places (except for the uberblock, as you noted), that does make snapshots easier and accounts for there being no explicit code to "set COW on disk blocks" or the like.

From a UFS/VxFS background, the thought of allocating additional disk storage in order to page out to a memory-mapped file (MMF), or of changing metadata to point to new blocks and hence needing to write out metadata as part of paging out modified file data, seems foreign to me. If the metadata writes can be bundled into the same transaction, perhaps there's no additional serialization latency on a pageout? Is this also another case where one might get ENOSPC when one wouldn't on other filesystems (paging out to an existing MMF in a full ZFS pool)?

From the comments in the source tour about the ZIL, I did note the statement that file contents do not go through the ZIL unless needed for O_DSYNC or fsync() semantics, so I wasn't sure how else they might be different.
How do snapshots interact with open files or files with pages in the
OpenSolaris page cache?

I don't believe they do.  Are you thinking of something in particular?
I am generally interested in understanding file consistency and cache coherency. I'd like to know what exactly is being snapshotted and what is consistent within a snapshot.

I subsequently saw an earlier thread on "ZFS consistency guarantee" (http://www.opensolaris.org/jive/thread.jspa;?messageID=124809) where you and others pointed out that application state is not consistent at a snapshot unless the application has been quiesced or otherwise brought to a consistent state. Even then, I'm curious about the interaction with the OpenSolaris page cache...

As is generally known and is explained well by Roch Bourbonnais in http://blogs.sun.com/roch/entry/nfs_and_zfs_a_fine, NFS places an extra requirement for committing writes to stable storage upon file close. For local filesystems, a close() can complete before all modified file data has been written to disk. Does all such file data get into a snapshot, or only the data that happens to have been pushed out to disk by the time of the snapshot (e.g. local open(), write(), close(), snapshot)?
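To make the local case concrete, here is a minimal sketch of the sequence I have in mind, assuming plain POSIX calls. The function name write_then_close() and the "durable" flag are my own, purely for illustration. With durable == 0 this is the open(), write(), close() case I'm asking about; with durable != 0 an fsync() forces the data to stable storage before close(). It says nothing about what a snapshot actually captures; that is the open question.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write buf to path, then close the file.
 * If durable is nonzero, call fsync() before close() so the data is
 * on stable storage when we return. If durable is zero, close() may
 * return while the data is still only cached in memory.
 * Returns 0 on success, -1 on any failure.
 */
int write_then_close(const char *path, const char *buf, int durable)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(buf);
    size_t off = 0;
    while (off < len) {                 /* handle short writes */
        ssize_t n = write(fd, buf + off, len - off);
        if (n < 0) {
            close(fd);
            return -1;
        }
        off += (size_t)n;
    }

    if (durable && fsync(fd) != 0) {    /* explicit flush, if requested */
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling write_then_close("/tank/fs/file", "data\n", 0) (path hypothetical) and then taking a snapshot is the scenario in question.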

It does look to me, from the comment and the call to zil_suspend within dmu_objset_snapshot_one, that any changes that have made it to the filesystem will get flushed out and included in the snapshot. This should apply to any metadata operations that have completed, such as link, unlink, etc. But if the VM system is caching file contents after a close (or at least nobody has pushed them out yet), is there any way to guarantee that such data makes it into the snapshot?

In my earlier question I was also thinking about other cases, such as an MMF where the application has written to a page with or without msync(), or a file that is still open after a write() but with no fsync(). Depending upon how the data of closed files is handled, those may be moot.
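For the MMF case, a rough sketch of what I mean, again assuming only POSIX calls; mmap_store() and its do_msync flag are hypothetical names of mine. It dirties a MAP_SHARED page and optionally calls msync(MS_SYNC) before unmapping. Without the msync(), the dirty page is written back whenever the VM system gets around to it.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: store msg into path through a shared mapping.
 * If do_msync is nonzero, msync(MS_SYNC) is called before unmapping;
 * otherwise the dirty page is left for the VM system to page out.
 * Returns 0 on success, -1 on any failure.
 */
int mmap_store(const char *path, const char *msg, int do_msync)
{
    size_t len = strlen(msg);
    if (len == 0)
        return -1;                      /* zero-length mappings are invalid */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (ftruncate(fd, (off_t)len) != 0) {   /* size the file to fit msg */
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(p, msg, len);                /* dirty the mapped page */

    int rc = 0;
    if (do_msync && msync(p, len, MS_SYNC) != 0)
        rc = -1;
    if (munmap(p, len) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

The interesting comparison is a snapshot taken right after mmap_store(path, msg, 0) versus one taken after mmap_store(path, msg, 1).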

Eric