> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Nico Williams
> To decide if a block needs dedup one would first check the Bloom
> filter, then if the block is in it, use the dedup code path, else the
> non-dedup codepath and insert the block in the Bloom filter.  

Sorry, I didn't know what a Bloom filter was before my last reply - now I've 
read the Wikipedia article and am consequently an expert.   *sic*   ;-)

It sounds like what you're describing is this: the first time some data gets 
written, it won't produce a hit in the Bloom filter, so it gets written to 
disk without dedup - but now it has an entry in the Bloom filter.  The second 
time the data block gets written (the first duplicate), it will produce a hit 
in the Bloom filter, and consequently get a dedup DDT entry.  But since the 
system didn't dedup the first one, the second one still needs to be written to 
disk independently of the first.  So in effect, you'll always "miss" the first 
write of a duplicated block, but you'll successfully dedup the remaining n-1 
duplicates.  Which is entirely reasonable, although not strictly optimal.  And 
sometimes you'll get a false positive out of the Bloom filter, so sometimes 
you'll run the dedup code on blocks that are actually unique; but with 
intelligently selected parameters, such as the Bloom table size and the number 
of hash functions, you can make that probability reasonably small, like less 
than 1%.
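To make the write-path decision concrete, here's a minimal sketch of the idea in Python.  The parameters (an m-bit table, k hash positions derived from one SHA-256 digest) are my own illustrative choices, not anything from ZFS; the point is just that the first write of a block misses the filter and all n-1 later writes of the same block hit it.

```python
# Sketch of the Bloom-filter write-path decision described above.
# M (table bits) and K (hash positions) are illustrative choices;
# the false-positive rate is roughly (1 - e^(-K*n/M))^K for n entries.
import hashlib

M = 8192   # bits in the filter
K = 4      # hash positions per entry

def _positions(block: bytes):
    # Derive K bit positions from a single SHA-256 digest of the block.
    d = hashlib.sha256(block).digest()
    return [int.from_bytes(d[4*i:4*i+4], "big") % M for i in range(K)]

bloom = [False] * M
dedup_hits = 0
unique_writes = 0

def write_block(block: bytes):
    global dedup_hits, unique_writes
    pos = _positions(block)
    if all(bloom[p] for p in pos):
        dedup_hits += 1          # dedup code path: go consult the DDT
    else:
        unique_writes += 1       # non-dedup path: plain write...
        for p in pos:
            bloom[p] = True      # ...then remember the block in the filter

for _ in range(5):
    write_block(b"same data")    # first write misses, next four hit

print(unique_writes, dedup_hits)  # prints: 1 4
```

So writing the same block five times takes the non-dedup path once and the dedup path four times - exactly the "n-1" behavior above.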

In the Wikipedia article, they say you can't remove an entry from a plain 
Bloom filter, which over time would cause a steady increase in the false 
positive rate (approaching 100%) and consequently a high probability of 
dedup'ing blocks that are actually unique.  But with even a minimal amount of 
thinking about it, I'm quite sure that's a solvable implementation detail: 
instead of storing a single bit at each position in the table, store a 
counter.  Every time you insert an entry, increment the counter at each of its 
hash positions; every time you remove an entry, decrement them.  Obviously a 
counter requires more bits than a single bit, but that's a modest, linear 
increase in size for a large gain in utility, and well within the limits of 
available hardware.  There may be a more intelligent way of accomplishing the 
same goal.  (Like I said, I've only thought about this minimally.)
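For what it's worth, that counter-per-position idea is a known variant (a counting Bloom filter), and it's a few lines to sketch.  Again the table size, hash count, and use of unbounded Python ints for the counters are illustrative assumptions, not a real design; in practice you'd pick a fixed counter width and worry about overflow.

```python
# Sketch of the counting variant: increment on insert, decrement on
# remove, membership test = all counters at the item's positions nonzero.
import hashlib

M = 8192   # counter slots
K = 4      # hash positions per entry

def _positions(item: bytes):
    d = hashlib.sha256(item).digest()
    return [int.from_bytes(d[4*i:4*i+4], "big") % M for i in range(K)]

counters = [0] * M

def insert(item: bytes):
    for p in _positions(item):
        counters[p] += 1

def remove(item: bytes):
    # Caller must only remove items actually inserted earlier,
    # otherwise counters underflow and the filter is corrupted.
    for p in _positions(item):
        counters[p] -= 1

def might_contain(item: bytes) -> bool:
    return all(counters[p] > 0 for p in _positions(item))

insert(b"block A")
print(might_contain(b"block A"))   # True
remove(b"block A")
print(might_contain(b"block A"))   # False - the entry is really gone
```

Removal works because decrementing only zeroes a counter once every entry that touched it is gone, so deleting one block can't evict another that happens to share a position.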

Meh, well.  Thanks for the interesting thought.  For whatever it's worth.

zfs-discuss mailing list