On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote:
> Although, as you pointed out, the chance of a sha256 collision is
> minimal, it can still happen, and that would mean
> that the dedup algorithm discards a block that it thinks is a duplicate.
> It's probably better anyway to do a byte-by-byte comparison
> if the hashes match, to be sure that the blocks are really identical.
> The funny thing here is that ZFS tries to solve all sorts of data integrity
> issues with checksumming and healing, etc.,
> and on the other hand a hash collision in the dedup algorithm can cause
> loss of data if wrongly configured.
> Anyway, thanks for bringing up the subject; now I know that if I
> enable the dedup feature I must set it to sha256,verify.
Oh jeez, I can't remember how many times this flame war has come up
on this list. Here's the gist: SHA-256 (or any good hash) produces a
near-uniform random distribution of output. Thus, the chance that any
two given distinct blocks collide is around 2^-256, or around 10^-77. If
I asked you to pick two atoms at random *from the entire observable
universe* (commonly estimated at around 10^80 atoms), your chances of
hitting on the same atom are of a comparable order to the chances of
that hash collision. So leave dedup=on with sha256 and move on.
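
To put rough numbers on this, here is a small Python sketch. It assumes SHA-256 behaves like an ideal random function and that nobody is deliberately crafting collisions. The function names and the example pool (4 PiB of unique data in 128 KiB records, i.e. 2^35 blocks) are hypothetical, picked only for illustration. It uses the standard birthday bound n(n-1)/2 / 2^256 as an upper limit on the chance of any collision among n distinct blocks.

```python
from fractions import Fraction
import math

HASH_BITS = 256  # SHA-256 output size in bits


def pair_collision_probability(bits=HASH_BITS):
    """Chance that two given distinct blocks hash to the same value,
    assuming the hash output is uniformly random."""
    return Fraction(1, 2 ** bits)


def birthday_bound(num_blocks, bits=HASH_BITS):
    """Upper bound on the chance of ANY collision among num_blocks
    distinct blocks: (number of pairs) / 2^bits."""
    pairs = num_blocks * (num_blocks - 1) // 2
    return Fraction(pairs, 2 ** bits)


def log10_fraction(p):
    """log10 of a positive Fraction, computed without float underflow."""
    return math.log10(p.numerator) - math.log10(p.denominator)


# Hypothetical pool: 4 PiB of unique data in 128 KiB records.
num_blocks = (4 * 2 ** 50) // (128 * 2 ** 10)  # 2^35 blocks

print("per pair:   10^%.1f" % log10_fraction(pair_collision_probability()))
print("whole pool: 10^%.1f" % log10_fraction(birthday_bound(num_blocks)))
```

Even with every pair of blocks in such a pool counted, the bound stays dozens of orders of magnitude below anything worth worrying about.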
zfs-discuss mailing list