2011-12-12 19:03, Pawel Jakub Dawidek wrote:
> On Sun, Dec 11, 2011 at 04:04:37PM +0400, Jim Klimov wrote:
>> I would not be surprised to see that there is some disk IO
>> adding delays for the second case (read of a deduped file
>> "clone"), because you still have to determine references
>> to this second file's blocks, and another path of on-disk
>> blocks might lead to it from a separate inode in a separate
>> dataset (or I might be wrong). Reading this second path of
>> pointers to the same cached data blocks might decrease speed
> As I said, ZFS reading path involves no dedup code. None at all.
I am not sure if we contradicted each other ;)
What I meant was that the ZFS reading path involves reading
logical data blocks at some point, consulting the cache(s)
to see whether the block is already cached (and up-to-date).
These blocks are addressed by some unique ID, and now with
dedup there can be several pointers to the same block.
So, basically, reading a file involves reading ZFS metadata,
determining data block IDs, and fetching them from disk or cache.
Indeed, this does not need to be dedup-aware; but if the other
chain of metadata blocks points to the same data or metadata
blocks which were already cached (for whatever reason, not
limited to dedup) - this is where the read-speed boost appears.
Likewise, if some blocks are not cached, such as metadata
needed to determine the second file's block IDs, this incurs
disk IOs and may decrease overall speed.
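
To illustrate what I mean, here is a toy model in Python. It is
not ZFS code and does not reflect the real ARC or block pointer
layout; the class, the block names, and the two-metadata-block
files are all made up for the example. It only counts simulated
disk IOs when two files reach the same data blocks through
different metadata chains.

```python
# Toy model only, not ZFS code. All names are hypothetical.

class ToyPool:
    def __init__(self):
        self.cache = set()    # IDs of blocks currently held in cache
        self.disk_reads = 0   # number of simulated disk IOs so far

    def fetch(self, block_id):
        """Fetch a block by ID; only a cache miss costs a disk IO."""
        if block_id not in self.cache:
            self.disk_reads += 1
            self.cache.add(block_id)
        return block_id

    def read_file(self, meta_blocks, data_blocks):
        """Walk the file's metadata, then fetch the data it points to.

        Returns the number of disk IOs this read needed.
        Note there is no dedup logic anywhere in this path.
        """
        before = self.disk_reads
        for block_id in meta_blocks:
            self.fetch(block_id)
        for block_id in data_blocks:
            self.fetch(block_id)
        return self.disk_reads - before


def demo():
    pool = ToyPool()
    shared_data = ["D1", "D2", "D3", "D4"]   # one on-disk copy
    file_a = (["MA1", "MA2"], shared_data)   # first metadata chain
    file_b = (["MB1", "MB2"], shared_data)   # second metadata chain
    return [
        pool.read_file(*file_a),  # cold: metadata and data from disk
        pool.read_file(*file_b),  # data cached, metadata is not
        pool.read_file(*file_a),  # everything cached
        pool.read_file(*file_b),  # everything cached
    ]


if __name__ == "__main__":
    print(demo())  # [6, 2, 0, 0]
```

In this model the second file still costs two IOs on its first
read (its own metadata), and nothing at all on the re-read.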
That's why I proposed redoing the test with re-reading both
files - by then all relevant data and metadata would be cached
and we might see a slightly faster read speed.
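
Something like the following could be used for the re-read test.
This is only a sketch: the file paths are placeholders, the
function names are mine, and wall-clock timing from userland
says nothing about which cache actually served the data.

```python
import time


def timed_read(path, chunk_size=1 << 20):
    """Read a file sequentially; return (bytes_read, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def reread_test(paths, passes=2):
    """Read every file once per pass, in order, timing each read.

    Returns a list of (pass_number, path, bytes_read, seconds).
    """
    results = []
    for pass_number in range(1, passes + 1):
        for path in paths:
            nbytes, seconds = timed_read(path)
            results.append((pass_number, path, nbytes, seconds))
    return results


if __name__ == "__main__":
    # Placeholder paths: substitute the original file and its
    # deduped "clone".
    files = ["/pool/fs1/file", "/pool/fs2/file-clone"]
    for pass_number, path, nbytes, seconds in reread_test(files):
        print(f"pass {pass_number}: {path}: {nbytes} bytes in {seconds:.3f}s")
```

If the second pass is faster for both files, that would be
consistent with the metadata being cached by then as well.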
Just for kicks ;)
zfs-discuss mailing list