As I'm sure you're all aware, file size in ZFS can differ greatly from
actual disk usage, depending on access patterns. For example, truncating
a 1M file down to 1 byte still uses up about 130k on disk when
recordsize=128k. I'm aware that this is a result of ZFS's rather
different internals, and that it works well for normal usage, but it
can make things difficult for applications that want to constrain their
own disk usage.

The particular application I'm working on that has such a problem is the
OpenAFS <http://www.openafs.org/> client, when it uses ZFS as the disk
cache partition. The disk cache is constrained to a user-configurable
size, and the amount of cache used is tracked by counters internal to
the OpenAFS client. Normally cache usage is tracked by just taking the
file length of a particular file in the cache, and rounding it up to the
next frsize boundary of the cache filesystem. This is obviously wrong
when ZFS is used, and so our cache usage tracking can become very
inaccurate. So, I have two questions which would help us fix this:

  1. Is there any interface to ZFS (or a configuration knob or
  something) that we can use from a kernel module to explicitly return a
  file to the more predictable size? In the above example, truncating a
  1M file (call it 'A') to 1 byte makes it take up 130k, but if we
  create a new file (call it 'B') containing that same 1 byte, it only
  takes up about 1k.
  Is there any operation we can perform on file 'A' to make it take up
  less space without having to create a new file 'B'?

  The cache files are often truncated and overwritten with new data,
  which is why this can become a problem. If there was some way to
  explicitly signal to ZFS that we want a particular file to be put in a
  smaller block or something, that would be helpful. (I am mostly
  ignorant of ZFS internals; if there's documentation somewhere that
  would have told me this, let me know.)

  2. Lacking 1., can anyone give an equation relating file length,
  maximum size on disk, and recordsize (plus any additional parameters
  needed)? If we just have a way of knowing in advance how much disk
  space we're going to take up by writing a certain amount of data, we
  should be okay.
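For question 2, here is the shape of estimator we would like to be able
to write. The rule encoded below -- once a file's length has ever
exceeded recordsize, it is charged in whole recordsize blocks even
after truncation -- is only my guess extrapolated from the
1M-truncated-to-1-byte observation above, not confirmed ZFS behavior,
which is exactly what I'm hoping someone can correct:

```c
#include <stdint.h>

/* Hypothetical worst-case disk usage for a cache file.  max_len_ever
 * is the largest length the file has ever had; the charging rule here
 * is a guess from observed behavior, not documented ZFS semantics. */
static uint64_t
max_disk_usage(uint64_t file_len, uint64_t max_len_ever,
               uint64_t recordsize)
{
    if (file_len == 0)
        return 0;
    if (max_len_ever <= recordsize) {
        /* Small files get a variable block size; assume the worst
         * case of one full recordsize block. */
        return recordsize;
    }
    /* Large (or formerly large) files: whole recordsize blocks,
     * including the short tail -- and at least one block even after
     * truncating to a few bytes, matching the 130k observation. */
    return ((file_len + recordsize - 1) / recordsize) * recordsize;
}
```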

Or, if anyone has any other ideas on how to overcome this, they would
be welcome.

-- 
Andrew Deason
adea...@sinenomine.net
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
