> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- > boun...@opensolaris.org] On Behalf Of Nico Williams > > To decide if a block needs dedup one would first check the Bloom > filter, then if the block is in it, use the dedup code path, else the > non-dedup codepath and insert the block in the Bloom filter.
Sorry, I didn't know what a Bloom filter was before I replied before - Now I've read the wikipedia article and am consequently an expert. *sic* ;-) It sounds like, what you're describing... The first time some data gets written, it will not produce a hit in the Bloom filter, so it will get written to disk without dedup. But now it has an entry in the Bloom filter. So the second time the data block gets written (the first duplicate) it will produce a hit in the Bloom filter, and consequently get a dedup DDT entry. But since the system didn't dedup the first one, it means the second one still needs to be written to disk independently of the first one. So in effect, you'll always "miss" the first duplicated block write, but you'll successfully dedup n-1 duplicated blocks. Which is entirely reasonable, although not strictly optimal. And sometimes you'll get a false positive out of the Bloom filter, so sometimes you'll be running the dedup code on blocks which are actually unique, but with some intelligently selected parameters such as Bloom table size, you can get this probability to be reasonably small, like less tha n 1%. In the wikipedia article, they say you can't remove an entry from the Bloom filter table, which would over time cause consistent increase of false positive probability (approaching 100% false positives) from the Bloom filter and consequently high probability of dedup'ing blocks that are actually unique; but with even a minimal amount of thinking about it, I'm quite sure that's a solvable implementation detail. Instead of storing a single bit for each entry in the table, store a counter. Every time you create a new entry in the table, increment the different locations; every time you remove an entry from the table, decrement. Obviously a counter requires more bits than a bit, but it's a linear increase of size, exponential increase of utility, and within the implementation limits of available hardware. But there may be a more intelligent way of accomplishing the same goal. (Like I said, I've only thought about this minimally). Meh, well. Thanks for the interesting thought. For whatever it's worth. _______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss