On 07/11/2012 04:19 PM, Gregg Wonderly wrote:
> But this is precisely the kind of "observation" that some people seem to miss 
> out on the importance of.  As Tomas suggested in his post, if this was true, 
> then we could have a huge compression ratio as well.  And even if there was 
> 10% of the bit patterns that created non-unique hashes, you could use the 
> fact that a block hashed to a known bit pattern that didn't have collisions, 
> to compress the other 90% of your data.
> I'm serious about this from a number of perspectives.  We worry about the 
> time it would take to reverse SHA or RSA hashes to passwords, not even 
> thinking that what if someone has been quietly computing all possible hashes 
> for the past 10-20 years into a database somewhere, with every 5-16 
> character password, and now has an instantly searchable hash-to-password 
> database.

This is well known in the security community as a "rainbow table"
attack, and the common defense is salting: a random per-password salt
is mixed into the hash, so one precomputed table no longer covers every
user. Never use a password hashing scheme which doesn't use salts, for
exactly the reason you outlined above.
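
To make the salting point concrete, here is a minimal sketch in Python
using the standard library's PBKDF2. The function names, the 16-byte
salt and the iteration count are my own illustrative assumptions, not a
recommendation for any particular system:

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor only (an assumption, not a tuned value).
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)  # per-password random salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Because every stored password gets its own salt, the same password
hashes to a different value each time, so a table precomputed without
the salt is useless and would have to be rebuilt per salt.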

> Sometimes we ignore the scale of time, thinking that only the immediately 
> visible details are what we have to work with.
> If no one has computed the hashes for every single 4K and 8K block, then 
> fine.  But, if that was done, and we had that data, we'd know for sure which 
> algorithm was going to work the best for the number of bits we are 
> considering.

Do you even realize how many 4K or 8K blocks there are?! A 4K block is
32768 bits and an 8K block is 65536 bits, so there are exactly 2^32768
or 2^65536 of them respectively. I wouldn't worry about somebody having
those pre-hashed ;-) Rainbow tables only work for a very small input
space, such as short passwords.
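
To get a feel for the size of those numbers, here is a small Python
sketch that counts the bits in a block and the number of decimal digits
in 2^bits. The helper names are hypothetical, chosen just for this
illustration:

```python
import math


def block_bits(block_bytes: int) -> int:
    """Number of bits in a block of the given size in bytes."""
    return block_bytes * 8


def decimal_digits_of_power_of_two(exponent: int) -> int:
    """Number of decimal digits in 2**exponent, via logarithms."""
    return math.floor(exponent * math.log10(2)) + 1


for size in (4096, 8192):
    bits = block_bits(size)
    digits = decimal_digits_of_power_of_two(bits)
    print(f"{size}-byte block: 2^{bits} possible contents, "
          f"a number with {digits} decimal digits")
```

The count of distinct 4K blocks is a number with 9865 decimal digits,
and for 8K blocks it has 19729 digits.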

> Speculating based on the theory of the algorithms for "random" number of bits 
> is just silly.  Where's the real data that tells us what we need to know?

If you don't trust math, then there's little I can do to convince you.
But remember our conversation the next time you step into a car or get
on an airplane. The odds that you'll die on that ride are far higher
than that you'll find a random hash collision in a 256-bit hash algorithm...
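
For anyone who would rather check the math than trust it: the standard
birthday approximation says that for n random inputs and a b-bit hash,
the chance of any collision is about n^2 / 2^(b+1), as long as that
value is small. A rough sketch in Python, working in log2 to avoid
underflow; the function name and the pool of 2^64 blocks are my own
illustrative assumptions, not figures from ZFS:

```python
def log2_collision_probability(log2_num_blocks: float, hash_bits: int) -> float:
    """log2 of the birthday-bound estimate p ~= n^2 / 2^(hash_bits + 1).

    log2_num_blocks is log2(n). The estimate only holds while p is small.
    """
    return 2 * log2_num_blocks - (hash_bits + 1)


# Hypothetical pool: 2^64 distinct blocks hashed with a 256-bit hash.
exponent = log2_collision_probability(64, 256)
print(f"collision probability is about 2^{exponent:.0f}")
```

With 2^64 blocks and a 256-bit hash the estimate comes out at 2^-129.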

zfs-discuss mailing list
