You can treat any hash function as an idealized one, but real hash
functions aren't ideal. There may well be as-yet-undiscovered input
bit pattern ranges where there's a large density of collisions in some
hash function, and indeed, since our hash functions aren't ideal,
there must be. We just don't know where these potential collisions
are -- for cryptographic purposes that ignorance is enough (along with
1st and 2nd pre-image resistance, but allow me to handwave), but for
dedup? *shudder*.
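
For reference, here is what the idealized assumption buys you. This is
a minimal sketch of the usual birthday-bound approximation; the function
name and the example figures are mine, picked for illustration, and the
result only holds for a hash that really behaves like a random function,
which is exactly the assumption in question.

```python
import math


def ideal_collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Approximate chance of at least one collision among num_blocks
    distinct blocks, ASSUMING an idealized (uniformly random) hash.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    pairs = num_blocks * (num_blocks - 1) // 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-pairs / 2**hash_bits)


if __name__ == "__main__":
    # Hypothetical pool: 2^35 unique blocks, 256-bit checksum.
    print(ideal_collision_probability(2**35, 256))
    # Same formula with a 32-bit hash and only 100,000 blocks.
    print(ideal_collision_probability(100_000, 32))
```

The first number is vanishingly small, but it says nothing about how a
real hash function behaves on the inputs you actually store.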

Now, for some content types collisions may not be a problem at all.
Think of security camera recordings: collisions will show up as bad
frames in a video stream that no one is ever going to look at, and if
someone ever does need that footage, well, too bad.

And for other content types collisions can be horrible. We ZFS lovers
love to talk about how silent bit rot means you may never know about
serious corruption in other filesystems until it's too late. Now, if
you disable verification in dedup, what do you get? The same
situation as other filesystems are in relative to bit rot, only with
Disabling verification is something to do after careful deliberation,
not something to do by default.
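
To make the difference concrete, here is a toy sketch of a dedup table
with and without verification. It is not how ZFS implements dedup: the
class, its methods, and the block names are all hypothetical, and the
hash is deliberately truncated to one byte so that collisions are
guaranteed to happen in a small test.

```python
import hashlib


class ToyDedupStore:
    """Toy block store that dedups on a (deliberately weak) hash."""

    def __init__(self, verify: bool, hash_bytes: int = 1):
        self.verify = verify
        self.hash_bytes = hash_bytes  # truncated on purpose to force collisions
        self.table = {}               # hash key -> list of stored blocks

    def _key(self, data: bytes) -> bytes:
        return hashlib.sha256(data).digest()[: self.hash_bytes]

    def write(self, data: bytes):
        """Store data and return a (key, slot) reference to it."""
        key = self._key(data)
        chain = self.table.setdefault(key, [])
        if not self.verify:
            # Trust the hash: any existing block with this key is "the same".
            if not chain:
                chain.append(data)
            return (key, 0)
        # Verify: compare bytes before sharing an existing block.
        for slot, existing in enumerate(chain):
            if existing == data:
                return (key, slot)
        chain.append(data)
        return (key, len(chain) - 1)

    def read(self, ref) -> bytes:
        key, slot = ref
        return self.table[key][slot]


def count_corrupted(verify: bool, num_blocks: int = 1000) -> int:
    """Write distinct blocks, read them back, count silent mismatches."""
    store = ToyDedupStore(verify=verify)
    blocks = [f"block-{i}".encode() for i in range(num_blocks)]
    refs = [store.write(b) for b in blocks]
    return sum(store.read(r) != b for r, b in zip(refs, blocks))


if __name__ == "__main__":
    print("verify on: ", count_corrupted(True))
    print("verify off:", count_corrupted(False))
```

With verification on, a collision costs an extra comparison and nothing
more. With it off, every colliding write silently hands back someone
else's data, and nothing in the store will ever tell you.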
zfs-discuss mailing list