On 07/11/2012 03:58 PM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Sašo Kiselkov
>> I really mean no disrespect, but this comment is so dumb I could swear
>> my IQ dropped by a few tenths of a point just by reading.
> Cool it please. You say "I mean no disrespect" and then say something which
> is clearly disrespectful.
I sort of flew off the handle there, and I shouldn't have. It felt like
Tomas was misrepresenting my position and putting words in my mouth. I
certainly didn't mean to diminish the validity of an argument.
> Tomas's point is to illustrate that hashing is a many-to-one function. If
> it were possible to rely on the hash to always be unique, then you could use
> it as a compression algorithm. He's pointing out that's insane. His
> comment was not in the slightest bit dumb; if anything, it seems like maybe
> somebody (or some people) didn't get his point.
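To make the pigeonhole argument concrete, here is a small counting
sketch (my own illustration, not from the thread): a 256-bit hash has
only 2**256 possible outputs, while there are vastly more distinct
4 KiB blocks, so collisions must exist and the hash cannot be an
invertible "compression".

```python
# Pigeonhole counting: distinct 4 KiB inputs vs. distinct 256-bit digests.
outputs = 2 ** 256            # possible SHA-256 digests
inputs = 2 ** (4096 * 8)      # possible 4 KiB blocks
print(inputs > outputs)       # True: far more inputs than digests
# On average each digest has 2**(32768 - 256) preimages among 4 KiB blocks.
print(inputs // outputs == 2 ** (4096 * 8 - 256))
```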
I understood his point very well and I never argued that hashing always
results in unique hash values, which is why I thought he was
misrepresenting what I said.
So for a full explanation of why hashes aren't usable for compression:
1) they are one-way (kind of a bummer for decompression)
2) their fixed output size is far below the Shannon limit of typical
   inputs (i.e. unusable for lossless compression)
3) their output is pseudo-random, so even if we find collisions, we
   have no way to distinguish which input was the one meant for a
   given hash value (all are equally probable)
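Point 3 is easy to demonstrate on a deliberately truncated hash (a
sketch of my own, not code from the thread): brute-forcing a 16-bit
truncation of SHA-256 quickly yields two distinct inputs with the same
digest, and nothing in the digest tells you which input was "the" one.

```python
import hashlib

def h16(data: bytes) -> bytes:
    """SHA-256 truncated to 16 bits, so collisions appear quickly."""
    return hashlib.sha256(data).digest()[:2]

# Brute-force a collision: two distinct inputs, identical digest.
seen = {}
i = 0
while True:
    msg = str(i).encode()
    d = h16(msg)
    if d in seen:
        a, b = seen[d], msg
        break
    seen[d] = msg
    i += 1

# Distinct preimages, same digest: the digest alone can't pick one.
print(a != b and h16(a) == h16(b))
```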
A formal proof would of course take longer to construct and would take
time that I feel is best spent writing code.
zfs-discuss mailing list