--- On Fri, 9/21/12, Sašo Kiselkov <skiselkov...@gmail.com> wrote:
> > I have a ZFS filesystem with compression turned on. Does the "used"
> > property show me the actual data size, or the compressed data size ?
> > If it shows me the compressed size, where can I see the actual data
> > size ?
> It shows the allocated number of bytes used by the filesystem, i.e.
> after compression. To get the uncompressed size, multiply "used" by
> "compressratio" (so for example if used=65G and compressratio=2.00x,
> then your decompressed size is 2.00 x 65G = 130G).
Ok, thank you. The problem with this is, the compressratio is only reported to
two decimal places, which means if I do the math, I'm only getting an
approximation. Since we may use these numbers to compute billing, it is
important to get it right.
Is there any way at all to get the real *exact* number ?
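
To put a number on how rough that approximation is, here is a small Python
sketch. It assumes the displayed compressratio is rounded to the nearest 0.01
(I have not verified whether ZFS rounds or truncates), and the function name
and its parameters are made up for illustration only:

```python
from decimal import Decimal

def logical_size_bounds(used_bytes, ratio_str, step=Decimal("0.01")):
    """Return (low, high) byte bounds for the uncompressed size.

    Assumption: the displayed ratio was rounded to the nearest `step`,
    so the true ratio lies within half a step of the displayed value.
    """
    ratio = Decimal(ratio_str.rstrip("x"))   # "2.00x" -> Decimal("2.00")
    half = step / 2
    low = int(used_bytes * (ratio - half))   # truncated to whole bytes
    high = int(used_bytes * (ratio + half))
    return low, high

used = 65 * 1024**3                          # used=65G, in bytes
low, high = logical_size_bounds(used, "2.00x")
print("uncompressed size is between", low, "and", high, "bytes")
print("width of the uncertainty window:", high - low, "bytes")
```

Under that assumption the window is about 1% of "used", so roughly 0.65G for
the 65G example, which is far too coarse for billing.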
> > Later, I enabled dedup for just a single filesystem on this pool:
> > zfs set dedup=on pool/dataset
> > and now, I see in 'zpool list' a value for dedupratio:
> > pool  dedupratio  1.65x  -
> > Why do I see a value here ? Isn't dedupe still OFF for the pool as a
> > whole ? I do NOT want to enable dedupe for the entire pool.
> That's because dedup operates at the block level, not the object level,
> i.e. it kicks into effect once the data passes through the filesystem
> layers and gets subdivided into disk blocks. The point is that
> de-duplication (in a sense) allows you to de-duplicate across multiple
> filesystems. Take for instance the following:
> NAME       DEDUP
> ---------  -----
> /tank/fsA  on
> /tank/fsB  off
> /tank/fsC  on
> /tank/fsD  off
> /tank/fsE  off
> Here ZFS will try to deduplicate the blocks in fsA not only against
> other blocks in fsA, but also against blocks in fsC.
Ok. So the dedupratio I see for the entire pool is "dedupe ratio for
filesystems in this pool that have dedupe enabled" ... yes ?
> > Also, why do I not see any dedupe stats for the individual filesystem ?
> > I see compressratio, and I see dedup=on, but I don't see any dedupratio
> > for the filesystem.
Ok, getting back to precise accounting ... if I turn on dedupe for a particular
filesystem, and then I multiply the "used" property by the compressratio
property, and calculate the real usage, do I need to do another calculation to
account for the deduplication ? Or does the "used" property not take into
account deduping ?
> > Did turning on dedupe for a single filesystem turn it on for the entire
> > pool ?
> In a sense, yes. The dedup machinery is pool-wide, but only writes from
> filesystems which have dedup enabled enter it. The rest simply pass it
> by and work as usual.
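
To check my understanding of that description, here is a toy Python model.
It is only a sketch of the behaviour as described above, not how ZFS actually
implements it; the ToyPool class, its fields, and the way the ratio is
computed are all my own assumptions:

```python
import hashlib

class ToyPool:
    """Toy model: one pool-wide dedup table, entered only by writes
    from filesystems that have dedup=on (hypothetical, not ZFS code)."""

    def __init__(self, dedup_by_fs):
        self.dedup_by_fs = dedup_by_fs  # filesystem name -> dedup on/off
        self.ddt = {}                   # block checksum -> reference count
        self.allocated = 0              # blocks actually stored in the pool

    def write(self, fs, block):
        if not self.dedup_by_fs.get(fs, False):
            # dedup=off: the write bypasses the table and is always stored
            self.allocated += 1
            return
        key = hashlib.sha256(block).hexdigest()
        if key in self.ddt:
            self.ddt[key] += 1          # duplicate: only bump the refcount
        else:
            self.ddt[key] = 1           # first copy: store it
            self.allocated += 1

    def dedupratio(self):
        # blocks referenced through the table / unique blocks in the table
        if not self.ddt:
            return 1.0
        return sum(self.ddt.values()) / len(self.ddt)

pool = ToyPool({"tank/fsA": True, "tank/fsB": False, "tank/fsC": True})
pool.write("tank/fsA", b"same data")
pool.write("tank/fsC", b"same data")    # dedups against the copy from fsA
pool.write("tank/fsB", b"same data")    # dedup=off, stored again
pool.write("tank/fsB", b"same data")    # dedup=off, stored yet again
print(pool.allocated, pool.dedupratio())
```

In this model the ratio is a single pool-level number computed only from
blocks that went through the table, and writes from fsB never touch it.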
Ok - but from a performance point of view, I am only using ram/cpu resources
for the deduping of just the individual filesystems I enabled dedupe on, right ?
I hope that turning on dedupe for just one filesystem did not incur ram/cpu
costs across the entire pool...
zfs-discuss mailing list