Re: [zfs-discuss] New fast hash algorithm - is it needed?

Ferenc-Levente Juhos Wed, 11 Jul 2012 07:25:16 -0700

You don't need to reproduce all possible blocks.
1. SHA256 produces a 256 bit hash
2. That means it produces a value on 256 bits, in other words a value
between 0 and 2^256 - 1
3. If you count from 0 to 2^256 inclusive and calculate the SHA256 of each
number, you hash 2^256 + 1 distinct inputs into only 2^256 possible
outputs, so you will get at least one hash collision (even if the hash
algorithm is perfectly distributed)
4. Counting from 0 to 2^256 - 1 is nothing else but reproducing all
possible bit patterns on 32 bytes; the one extra input, 2^256, is what
forces the collision
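
The counting argument above can be tried at a small scale. The sketch
below is only an illustration, not anything ZFS does: it assumes a toy
hash made by keeping the top 16 bits of SHA-256, and the helper names
(truncated_sha256, first_collision) are made up for this example. It
hashes the counters 0, 1, 2, ... up to 2^16 inclusive, which is one more
input than there are possible outputs, so a collision has to appear.

```python
import hashlib


def truncated_sha256(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of SHA-256(data) as an integer (toy hash)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def first_collision(bits: int):
    """Hash the counters 0..2**bits inclusive and return the first collision.

    Returns (earlier_input, later_input, shared_hash). There are
    2**bits + 1 inputs but only 2**bits possible hash values, so the
    loop always returns before it runs out of inputs.
    """
    seen = {}  # truncated hash value -> first counter that produced it
    nbytes = bits // 8 + 1  # wide enough to encode the value 2**bits too
    for n in range(2**bits + 1):
        h = truncated_sha256(n.to_bytes(nbytes, "big"), bits)
        if h in seen:
            return seen[h], n, h
        seen[h] = n
    raise AssertionError("unreachable: more inputs than hash values")


a, b, h = first_collision(16)
print(f"inputs {a} and {b} share the 16-bit hash {h:#06x}")
```

In practice the loop stops long before the last counter; the full count
is only the worst case that makes the collision certain.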


It's not about whether one computer is capable of producing the above
hashes or not, or whether there are actually that many unique 32 byte bit
patterns in the universe.
A collision can happen.

On Wed, Jul 11, 2012 at 3:57 PM, Gregg Wonderly <gr...@wonderly.org> wrote:

> Since there is a finite number of bit patterns per block, have you tried
> to just calculate the SHA-256 or SHA-512 for every possible bit pattern to
> see if there is ever a collision?  If you found an algorithm that produced
> no collisions for any possible block bit pattern, wouldn't that be the win?
>
> Gregg Wonderly
>
> On Jul 11, 2012, at 5:56 AM, Sašo Kiselkov wrote:
>
> > On 07/11/2012 12:24 PM, Justin Stringfellow wrote:
> >>> Suppose you find a weakness in a specific hash algorithm; you use this
> >>> to create hash collisions and now imagine you store the hash collisions
> >>> in a zfs dataset with dedup enabled using the same hash algorithm.....
> >>
> >> Sorry, but isn't this what dedup=verify solves? I don't see the problem
> >> here. Maybe all that's needed is a comment in the manpage saying hash
> >> algorithms aren't perfect.
> >
> > It does solve it, but at a cost to normal operation. Every write gets
> > turned into a read. Assuming a big enough and reasonably busy dataset,
> > this leads to tremendous write amplification.
> >
> > Cheers,
> > --
> > Saso
> > _______________________________________________
> > zfs-discuss mailing list
> > zfs-discuss@opensolaris.org
> > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
