On 07/11/2012 10:50 AM, Ferenc-Levente Juhos wrote:
> Actually although as you pointed out that the chances to have an sha256
> collision is minimal, but still it can happen, that would mean
> that the dedup algorithm discards a block that he thinks is a duplicate.
> Probably it's anyway better to do a byte to byte comparison
> if the hashes match to be sure that the blocks are really identical.
>
> The funny thing here is that ZFS tries to solve all sorts of data integrity
> issues with checksumming and healing, etc.,
> and on the other hand a hash collision in the dedup algorithm can cause
> loss of data if wrongly configured.
>
> Anyway thanks that you have brought up the subject, now I know if I will
> enable the dedup feature I must set it to sha256,verify.

Oh jeez, I can't remember how many times this flame war has come up on
this list. Here's the gist: SHA-256 (or any good hash) produces a
near-uniform random distribution of output. Thus, the chance that any
given pair of different blocks hashes to the same value is around 2^-256,
or around 10^-77. If I asked you to pick two atoms at random *from the
entire observable universe* (roughly 10^80 atoms by the usual estimate),
your chances of hitting on the same atom are in the same ballpark as the
chances of that hash collision. So leave dedup=on with sha256 and move on.

Cheers,
--
Saso
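
For anyone who wants to check the arithmetic on a whole pool rather than a single pair of blocks, here is a rough sketch using the birthday-bound approximation p ~= n(n-1)/2 / 2^256. The function name, the 4 PiB pool size and the 128 KiB block size are illustrative assumptions, not anything taken from ZFS itself. It also assumes SHA-256 behaves like a uniform random function and says nothing about deliberately crafted collisions.

```python
def collision_probability(num_blocks: int, hash_bits: int = 256) -> float:
    """Approximate chance of at least one accidental collision.

    Birthday-bound approximation, valid while the result is tiny:
    (number of block pairs) / (number of possible hash values).
    """
    pairs = num_blocks * (num_blocks - 1) // 2
    return pairs / (1 << hash_bits)


if __name__ == "__main__":
    # Hypothetical pool: 4 PiB of unique data in 128 KiB blocks.
    pool_bytes = 4 * 2**50
    block_bytes = 128 * 2**10
    num_blocks = pool_bytes // block_bytes  # 2**35 blocks

    print(f"single pair : {collision_probability(2):.3e}")
    print(f"whole pool  : {collision_probability(num_blocks):.3e}")
```

Even summed over every pair of blocks in a pool of that size, the estimate stays dozens of orders of magnitude below anything you would notice in practice.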