What I'm saying is that I am getting conflicting information from your
I (and others) say there will be collisions that will cause data loss if verify
You say it would be so rare as to be impossible from your perspective.
Tomas says, well then let's just use the hash value for a 4096X compression.
You fluff around his argument calling him names.
I say, well then compute all the possible hashes for all possible bit patterns
and demonstrate no dupes.
You say it's not possible to do that.
I illustrate a way that loss of data could cost you money.
You say it's impossible for there to be a chance of me constructing a block
that has the same hash but different content.
Several people have illustrated that 128K to 32 bytes is a huge and lossy ratio
of compression, yet you still say it's viable to leave verify off.
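To put numbers on that ratio, here is a minimal Python sketch. It assumes a
128 KiB block and a 32-byte (256-bit) checksum such as SHA-256, which is what
gives the 4096X figure. Finding a full SHA-256 collision by search is not
feasible, so the sketch truncates the digest to 2 bytes (my assumption, purely
for illustration) to show the pigeonhole effect on a key space small enough to
exhaust. The names truncated_hash and find_collision are hypothetical helpers,
not anything from ZFS.

```python
import hashlib

BLOCK_BYTES = 128 * 1024   # assumed ZFS record size of 128 KiB
HASH_BYTES = 32            # assumed 256-bit checksum (e.g. SHA-256)

# "Compression" ratio if the checksum alone stood in for the block.
ratio = BLOCK_BYTES // HASH_BYTES


def truncated_hash(data: bytes, nbytes: int = 2) -> bytes:
    """SHA-256 cut down to nbytes, so the key space is small enough to exhaust."""
    return hashlib.sha256(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Hash distinct inputs until two of them share a truncated digest.

    With nbytes=2 there are only 65536 possible digests, so by the
    pigeonhole principle this must stop within 65537 inputs.
    """
    seen = {}
    i = 0
    while True:
        block = i.to_bytes(8, "big")
        digest = truncated_hash(block, nbytes)
        if digest in seen:
            return seen[digest], block
        seen[digest] = block
        i += 1


a, b = find_collision()
print("ratio:", ratio)
print("colliding inputs:", a.hex(), b.hex())
```

The same pigeonhole argument applies to the full 256-bit digest; what changes
is only how many inputs you would expect to need before hitting a pair.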
I say that, in fact, the total number of unique patterns that can exist on any
pool is small compared to the total number of possible patterns, illustrating
that I understand how the part of the key space the algorithm actually uses is
small when looking at a ZFS pool, and thus could turn out to be collision-free.
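The usual way to quantify this is the birthday bound. The sketch below is a
rough estimate only: it assumes the checksum behaves like a uniformly random
256-bit function and that the pool holds 2^35 unique 128 KiB blocks (about
4 PiB), both of which are my assumptions for illustration. The function name
collision_probability is hypothetical.

```python
import math


def collision_probability(n_blocks: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes hash outputs are uniformly distributed and independent.
    expm1 keeps precision when the exponent is extremely close to zero.
    """
    pairs = n_blocks * (n_blocks - 1) // 2
    return -math.expm1(-pairs / 2 ** hash_bits)


# Assumed pool: 2^35 unique blocks of 128 KiB each, roughly 4 PiB of data.
n = 2 ** 35
p = collision_probability(n, 256)
print(f"approximate collision probability: {p:.3e}")

# For contrast, a 32-bit key space with only 100,000 blocks.
p32 = collision_probability(100_000, 32)
print(f"32-bit checksum, 100k blocks: {p32:.3f}")
```

Under those assumptions the estimate is tiny but not zero, which is the
distinction being argued over here.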
So I can see what perspective you are drawing your confidence from, but I, and
others, are not confident that the risk has zero probability.
I'm pushing you to find a way to demonstrate that there is zero risk because if
you do that, then you've in fact created the ultimate compression factor to
date for random bit patterns (but enlarged the set of keys that could collide,
because the pool is now virtually larger), and you've also demonstrated that
the particular algorithm is very good for dedup.
That would indicate to me that you can then take that algorithm and run it
inside of ZFS dedup to automatically manage when verify is necessary by
detecting when a collision occurs.
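As a sketch of what "detecting when a collision occurs" could look like, here
is a toy dedup table in Python. This is not how ZFS implements dedup; the
DedupStore class, its write method and the collisions counter are all
hypothetical, and the 1-byte checksum in the usage part is deliberately weak
so that collisions actually happen.

```python
import hashlib


class DedupStore:
    """Toy dedup table keyed by checksum (hypothetical, not the ZFS DDT)."""

    def __init__(self, hash_fn=None, verify=True):
        self.hash_fn = hash_fn or (lambda b: hashlib.sha256(b).digest())
        self.verify = verify
        self.table = {}       # checksum -> list of distinct blocks stored
        self.collisions = 0   # checksum matches whose contents differed

    def write(self, block: bytes) -> str:
        key = self.hash_fn(block)
        stored = self.table.setdefault(key, [])
        if not stored:
            stored.append(block)
            return "new"
        if not self.verify:
            # Checksum is trusted: the incoming block is treated as a
            # duplicate even if its contents differ (silent data loss).
            return "deduped"
        if block in stored:
            return "deduped"
        # Same checksum, different bytes: a real collision was detected.
        self.collisions += 1
        stored.append(block)
        return "collision"


def weak(b: bytes) -> bytes:
    """Deliberately weak 1-byte checksum so collisions are guaranteed."""
    return hashlib.sha256(b).digest()[:1]


with_verify = DedupStore(hash_fn=weak, verify=True)
without_verify = DedupStore(hash_fn=weak, verify=False)
for i in range(300):
    block = i.to_bytes(2, "big")
    with_verify.write(block)
    without_verify.write(block)

kept_v = sum(len(v) for v in with_verify.table.values())
kept_nv = sum(len(v) for v in without_verify.table.values())
print("verify on :", kept_v, "blocks kept,", with_verify.collisions, "collisions")
print("verify off:", kept_nv, "blocks kept")
```

Note that the detection itself requires comparing the bytes, so a store can
only count collisions on the writes where it pays for the verify.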
I appreciate the pushback. I'm trying to drive thinking about this in the
direction of what is known and finite, away from what is infinitely complex and
thus impossible to explore.
Maybe all the work has already been done…
On Jul 11, 2012, at 11:02 AM, Sašo Kiselkov wrote:
> On 07/11/2012 05:58 PM, Gregg Wonderly wrote:
>> You're entirely sure that there could never be two different blocks that can
>> hash to the same value and have different content?
>> Wow, can you just send me the cash now and we'll call it even?
> You're the one making the positive claim and I'm calling bullshit. So
> the onus is on you to demonstrate the collision (and that you arrived at
> it via your brute force method as described). Until then, my money stays
> safely on my bank account. Put up or shut up, as the old saying goes.
zfs-discuss mailing list