On Thu, Dec 03, 2009 at 12:44:16PM -0800, Per Baatrup wrote:
> >if any of f2..f5 have different block sizes from f1
> 
> This restriction does not sound so bad to me if it only refers to
> changes to the blocksize of a particular ZFS filesystem or to copying
> between different ZFSes in the same pool. This could probably be
> managed with a "-f" switch on the userland app to force the copy when
> it would otherwise fail.

Why expose such details?

If you have dedup on and if the file blocks and sizes align then

    cat f1 f2 f3 f4 f5 > f6

will do the right thing and consume only space for new metadata.

If the file blocks and sizes do not align then

    cat f1 f2 f3 f4 f5 > f6

will still work correctly.

Or do you mean that you want a way to do that cat ONLY if it would
consume no new space for data?  (That might actually be a good
justification for a ZFS cat command, though I think, too, that one could
script it.)
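For what it's worth, a rough sketch of such a script (untested; the
name, the dataset/output arguments, and the single-dataset assumption
are all mine, not an existing tool):

    #!/bin/sh
    # Sketch only: concatenate files, but bail out first unless dedup
    # could fold every input block.  All inputs are assumed to live in
    # the one dataset given as $1.
    # Usage: dedup-cat <dataset> <output> <f1> [f2 ...]

    dataset=$1 out=$2; shift 2
    # -Hp gives the recordsize as a bare number of bytes
    rs=`zfs get -Hp -o value recordsize "$dataset"` || exit 1

    i=1
    for f do
        sz=`wc -c < "$f"`
        # every input but the last must end on a record boundary,
        # or everything after it lands at unaligned offsets
        if [ $i -lt $# ] && [ `expr $sz % $rs` -ne 0 ]; then
            echo "$f: $sz bytes is not a multiple of $rs" >&2
            exit 1
        fi
        i=`expr $i + 1`
    done

    exec cat "$@" > "$out"

Note this only checks sizes against the current recordsize; files
written while the property was set differently (or small files stored
in a single short block) can still fail to dedup, so it's a heuristic,
not a guarantee.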

> >any of f1..f5's last blocks are partial
> 
> Does this mean that f1,f2,f3,f4 need to be exact multiples of the
> ZFS blocksize? That is a severe restriction that will fail except in
> very special cases.

Say f1 is 1MB, f2 is 128KB, f3 is 510 bytes, f4 is 514 bytes, and f5 is
10MB, and the recordsize of their containing datasets is 128KB.  Then
the new file will consume 10MB + 128KB more space than f1..f5 did,
while 1MB + 128KB will be de-duplicated.
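To spell out the arithmetic:

    f1 at offset 0               aligned; 8 full 128KB blocks -> dedup'd
    f2 at offset 1MB             aligned; 1 full 128KB block  -> dedup'd
    f3 at offset 1MB+128KB       aligned, but partial: its block in f6
                                 also holds f4 and the start of f5
    f4 at offset 1MB+128KB+510   unaligned
    f5 at offset 1MB+128KB+1024  unaligned; every block of f5 shifts

    dedup'd:  f1 + f2 = 1MB + 128KB
    new data: f3 + f4 + f5 = 10MB + 1024 bytes, stored as 81 fresh
              128KB records = 10MB + 128KB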

This is not really "a severe restriction".  Making ZFS do better than
that would require a lot of extra metadata and complexity in the
filesystem, which users who don't need space-efficient file
concatenation (that is, most users) won't want to pay for.

> Is this related to the disk format, or is it a restriction in the
> implementation? (Do you know where to look in the source code?)

Both.

> >...but also ZFS most likely could not do any better with any other, more
> >specific non-dedup solution
> 
> Probably lots of I/O traffic and digest calculations+lookups could be
> saved, since we already know it will be a duplicate.  (In our case
> the files are gigabyte-sized.)

ZFS hashes, and records the hashes of, whole blocks, not sub-blocks.
Look at my example above.  Deduplicating the 10MB of f5 in the
concatenation efficiently would require something like "sub-block
pointers".  Alternatively, if you want a concatenation-specific
feature, ZFS would have to have a metadata notion of concatenation,
but then the Unix way of concatenating files couldn't be used for it,
since the necessary context is lost in the I/O redirection.
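You can watch the context get lost: cat(1) opens its inputs, but only
ever writes to file descriptor 1, which the shell pointed at f6 before
cat even ran.  Something like this shows it (on OpenSolaris; the
trace.out name is just an example):

    $ truss -o trace.out cat f1 f2 > f6
    $ egrep 'open|write' trace.out

The trace is just plain write(2)s of byte ranges; nothing in the
syscall stream says "f6 is f1 followed by f2", so the filesystem has
nothing to hang a concatenation record on.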

Nico