On 07/11/2012 03:58 PM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Sašo Kiselkov
>> I really mean no disrespect, but this comment is so dumb I could swear
>> my IQ dropped by a few tenths of a point just by reading.
> Cool it please.  You say "I mean no disrespect" and then say something which
> is clearly disrespectful.

I sort of flew off the handle there, and I shouldn't have. It felt like
Tomas was misrepresenting my position and putting words in my mouth. I
certainly didn't mean to diminish the validity of an honest question.

> Tomas's point is to illustrate that hashing is a many-to-one function.  If
> it were possible to rely on the hash to always be unique, then you could use
> it as a compression algorithm.  He's pointing out that's insane.  His
> comment was not in the slightest bit dumb; if anything, it seems like maybe
> somebody (or some people) didn't get his point.

I understood his point very well and I never argued that hashing always
results in unique hash values, which is why I thought he was
misrepresenting what I said.

So, for a full explanation of why hashes aren't usable for compression
(a small sketch follows the list):

 1) they are one-way (kind of a bummer for decompression)
 2) their output is a fixed size far below the Shannon limit of
    typical inputs (a 256-bit digest cannot losslessly encode a
    128 KiB block, of which there are 2^1048576 possibilities), so
    lossless reconstruction is information-theoretically impossible
 3) their output is pseudo-random, so even if we could enumerate the
    colliding inputs for a given hash value, we would have no way to
    tell which one was originally meant (all are equally probable)
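
As an informal illustration of points 1 and 3, here's a minimal
sketch (Python, standard-library hashlib only). It truncates SHA-256
to 32 bits purely so that a birthday search turns up a collision in
seconds; the pigeonhole argument is identical for the full 256-bit
digest, just with an astronomically longer search:

# Minimal sketch: truncate SHA-256 to 32 bits so a birthday search
# finds a collision quickly. The truncation only shortens the wait;
# the same pigeonhole argument applies to the full 256-bit digest.
import hashlib
import itertools

def h32(data: bytes) -> bytes:
    # Stand-in "hash-as-compressor": first 4 bytes of SHA-256.
    return hashlib.sha256(data).digest()[:4]

seen = {}  # digest -> first input observed with that digest
for i in itertools.count():
    msg = i.to_bytes(8, "big")  # every input is distinct by design
    d = h32(msg)
    if d in seen:
        # Two distinct inputs, one "compressed" value: a decompressor
        # has no way to know which of the two was meant.
        print(f"inputs {seen[d].hex()} and {msg.hex()} "
              f"both hash to {d.hex()}")
        break
    seen[d] = msg

With the full digest the same search would need on the order of 2^128
attempts, which is why collisions never show up in practice; their
existence is nevertheless what sinks the hash-as-compressor idea.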

A formal proof would of course take longer to construct, and that's
time I feel is better spent writing code.
