The downside you have described happens only when the same checksum is used both for data protection and for duplicate detection. This implies sha256, by the way, since fletcher-based dedup has been dropped in recent builds.
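To make the trade-off concrete, here is a minimal Python sketch (illustrative only, not ZFS code; block contents and the zlib compressor are my assumptions). With a deterministic compressor, identical blocks are recognised as duplicates whether the dedup key is taken over the uncompressed or the compressed data; the difference is only whether you must pay the compression cost before the dedup lookup.

```python
import hashlib
import zlib

# Illustrative sketch -- not ZFS code. Two logically identical blocks
# are found to be duplicates either way, because the compressor is
# deterministic (same input -> same compressed output).
block_a = b"payload " * 512
block_b = b"payload " * 512   # logically identical block

# Dedup key over the uncompressed data: no compression is needed
# before the dedup-table lookup.
key_pre_a = hashlib.sha256(block_a).digest()
key_pre_b = hashlib.sha256(block_b).digest()

# Dedup key over the compressed (on-disk) data: the block must be
# compressed before it can be recognised as a duplicate.
key_post_a = hashlib.sha256(zlib.compress(block_a)).digest()
key_post_b = hashlib.sha256(zlib.compress(block_b)).digest()

assert key_pre_a == key_pre_b    # duplicate detected pre-compression
assert key_post_a == key_post_b  # duplicate detected post-compression
print("duplicates detected either way")
```

Storage efficiency is the same in both schemes; only the CPU ordering differs, which is the point Kjetil raised below.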
On 12/17/09, Kjetil Torgrim Homme <kjeti...@linpro.no> wrote:
> Andrey Kuzmin <andrey.v.kuz...@gmail.com> writes:
>> Darren J Moffat wrote:
>>> Andrey Kuzmin wrote:
>>>> Resilvering has nothing to do with sha256: one could resilver long
>>>> before dedup was introduced in zfs.
>>>
>>> SHA256 isn't just used for dedup; it has been available as one of the
>>> checksum algorithms right back to pool version 1, which integrated in
>>> build 27.
>>
>> 'One of' is the key word. And thanks for the code pointers, I'll take a
>> look.
>
> I didn't mention sha256 at all :-). The reasoning is the same no matter
> what hash algorithm you're using (fletcher2, fletcher4 or sha256). Dedup
> doesn't require sha256 either; you can use fletcher4.
>
> The question was: why does data have to be compressed before it can be
> recognised as a duplicate? It does seem like a waste of CPU, no? I
> attempted to show the downsides to identifying blocks by their
> uncompressed hash. (BTW, it doesn't affect storage efficiency: the same
> duplicate blocks will be discovered either way.)
>
> --
> Kjetil T. Homme
> Redpill Linpro AS - Changing the game
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
Regards,
Andrey