On 20/07/2010 04:41, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Richard L. Hamilton
>>
>> I would imagine that if it's read-mostly, it's a win, but otherwise
>> it costs more than it saves. Even with more conventional compression,
>> compressing tends to be more resource-intensive than decompressing...
> I would imagine it's *easier* to have a win when it's read-mostly, but
> the expense of computing checksums is incurred either way, with or
> without dedup. The only extra cost dedup adds is maintaining a hash
> tree of some kind, to see whether a block has already been stored on
> disk. So ... of course I'm speaking hypothetically and this hasn't
> been proven ... I think dedup will accelerate the system in nearly all
> use cases.
>
> The main exception is when you have highly non-duplicated data. I
> think the CPU cost of dedup is tiny, but in the case of highly
> non-duplicated data, even that small expense is a waste.
Please note that by default ZFS uses fletcher4 checksums, but dedup
currently allows only sha256, which is more CPU-intensive. Also, from a
performance point of view, there will be a sudden drop in write
performance the moment the DDT can no longer fit entirely in memory.
An L2ARC could mitigate the impact, though.
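For example, to turn dedup on and give the DDT a better chance of
staying cached, something like this (pool, dataset, and device names
here are just placeholders):

  # zfs set dedup=on tank/data         # dedup'd blocks use sha256 checksums
  # zfs get checksum,dedup tank/data   # verify the current settings
  # zpool add tank cache c4t2d0        # add an SSD as L2ARC to help hold the DDT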
Then there will be less memory available for caching data, due to the
extra memory required for the DDT.

(However, please note that IIRC the DDT is treated as metadata, and by
default the metadata cache is limited to no more than 20% of the ARC -
there is a bug open for this; I haven't checked whether it's been
fixed yet or not.)
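If you want to see how much of the ARC is going to metadata, the
arcstats kstats are useful; a quick sketch (stat names from memory, so
treat this as a rough pointer):

  # kstat -p zfs:0:arcstats:arc_meta_used
  # kstat -p zfs:0:arcstats:arc_meta_limit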
> What I'm wondering is when dedup is a better value than compression.
> Whenever files have internal repetition, compression will be better.
> Whenever the repetition crosses file boundaries, dedup will be better.
Not necessarily. Compression in ZFS works only within the scope of a
single filesystem block.

So, for example, if you have a large file with most of its blocks
identical, dedup should "compress" the file much better than
compression would. Also note that you can use both compression and
dedup at the same time.
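For example, enabling both and then checking how each is doing (again,
the dataset name is just a placeholder):

  # zfs set compression=on tank/data   # compresses within each block
  # zfs set dedup=on tank/data         # deduplicates across blocks
  # zfs get compressratio tank/data    # achieved compression ratio
  # zpool get dedupratio tank          # achieved dedup ratio, pool-wide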
--
Robert Milkowski
http://milek.blogspot.com