The downside you have described arises only when the same checksum is
used for both data protection and duplicate detection. That implies sha256,
BTW, since fletcher-based dedup has been dropped in recent builds.
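To make the coupling concrete, here is a minimal sketch (hypothetical, not actual ZFS code) of a write path in which one per-block digest serves both roles: it is stored for on-read integrity verification and it is also the dedup-table key. The `DedupPool` class and its method names are illustrative only.

```python
import hashlib

class DedupPool:
    """Hypothetical sketch: the block's integrity checksum doubles
    as the dedup-table key, which is the coupling discussed above."""

    def __init__(self):
        self.dedup_table = {}  # digest -> (stored block, refcount)

    def write_block(self, block: bytes) -> bytes:
        # One sha256 digest serves both purposes: it goes into the
        # block pointer for later verification, and it is the key
        # looked up to detect a duplicate.
        digest = hashlib.sha256(block).digest()
        if digest in self.dedup_table:
            stored, refs = self.dedup_table[digest]
            self.dedup_table[digest] = (stored, refs + 1)
        else:
            self.dedup_table[digest] = (block, 1)
        return digest

    def read_block(self, digest: bytes) -> bytes:
        block, _ = self.dedup_table[digest]
        # The integrity check reuses the very digest that keyed dedup,
        # so the two functions cannot use different checksums here.
        assert hashlib.sha256(block).digest() == digest
        return block
```

Because the same digest is reused, whatever transformation (e.g. compression) is applied before checksumming is forced on both the integrity path and the dedup path at once.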

On 12/17/09, Kjetil Torgrim Homme <kjeti...@linpro.no> wrote:
> Andrey Kuzmin <andrey.v.kuz...@gmail.com> writes:
>> Darren J Moffat wrote:
>>> Andrey Kuzmin wrote:
>>>> Resilvering has nothing to do with sha256: one could resilver long
>>>> before dedup was introduced in zfs.
>>>
>>> SHA256 isn't just used for dedup it is available as one of the
>>> checksum algorithms right back to pool version 1 that integrated in
>>> build 27.
>>
>> 'One of' is the key word. And thanks for code pointers, I'll take a
>> look.
>
> I didn't mention sha256 at all :-).  the reasoning is the same no matter
> what hash algorithm you're using (fletcher2, fletcher4 or sha256).  dedup
> doesn't require sha256 either, you can use fletcher4.
>
> the question was: why does data have to be compressed before it can be
> recognised as a duplicate?  it does seem like a waste of CPU, no?  I
> attempted to show the downsides to identifying blocks by their
> uncompressed hash.  (BTW, it doesn't affect storage efficiency, the same
> duplicate blocks will be discovered either way.)
>
> --
> Kjetil T. Homme
> Redpill Linpro AS - Changing the game
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
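Kjetil's claim about storage efficiency can be checked in a few lines: with a deterministic compressor, equal uncompressed blocks yield equal compressed blocks, so a dedup hash taken after compression discovers exactly the same duplicates as one taken before it. Only the CPU spent compressing the second copy differs. A sketch using zlib as a stand-in compressor (ZFS would use lzjb/gzip, but the determinism argument is the same):

```python
import hashlib
import zlib

block_a = b"some repeated data " * 100
block_b = b"some repeated data " * 100  # duplicate content

# Hashing the uncompressed data identifies the duplicate before any
# compression work is done:
assert hashlib.sha256(block_a).digest() == hashlib.sha256(block_b).digest()

# Hashing after compression finds the same duplicate, because a
# deterministic compressor maps equal inputs to equal outputs:
ca = zlib.compress(block_a, 6)
cb = zlib.compress(block_b, 6)
assert ca == cb
assert hashlib.sha256(ca).digest() == hashlib.sha256(cb).digest()
```

So either ordering stores one copy; the uncompressed-hash scheme merely lets the writer skip compressing blocks already found in the dedup table, at the cost of the downsides Kjetil described.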


-- 
Regards,
Andrey