On Sep 25, 2012, at 11:17 AM, Jason Usher <jushe...@yahoo.com> wrote:
> Ok - but from a performance point of view, I am only using
> ram/cpu resources for the deduping of just the individual
> filesystems I enabled dedupe on, right ? I hope that
> turning on dedupe for just one filesystem did not incur
> ram/cpu costs across the entire pool...
> It depends. -- richard
> Can you elaborate at all ? Dedupe can have fairly profound performance
> implications, and I'd like to know if I am paying a huge price just to get a
> dedupe on one little filesystem ...
The short answer is: "deduplication transforms big I/Os into small I/Os,
but does not eliminate I/O." The reason is that the deduplication table has
to be updated when you write something that is deduplicated. This implies
that storage devices which are inexpensive in $/GB but expensive in $/IOPS
might not be the best candidates for deduplication (e.g., HDDs). There is some
additional CPU overhead for the SHA-256 hash that might or might not be
noticeable, depending on your CPU. But perhaps the most important factor
is your data -- is it dedupable and are the space savings worthwhile? There
is no simple answer for that, but we generally recommend that you simulate
dedup before committing to it.
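
As a rough sketch of how you might do that (the names "tank" and
"tank/builds" below are just placeholders for your own pool and
filesystem):

    # Simulate dedup for the whole pool; zdb walks the data and prints
    # a simulated DDT histogram plus an estimated dedup ratio. It reads
    # all of the data, so it can take a while on a large pool.
    zdb -S tank

    # If the estimated ratio looks worthwhile, enable dedup only on the
    # filesystem that actually holds the dedupable data.
    zfs set dedup=on tank/builds

If the estimated ratio comes back close to 1.0, the space savings are
probably not worth the DDT update overhead described above.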
illumos Day & ZFS Day, Oct 1-2, 2012 San Fransisco