On 07/11/2012 01:09 PM, Justin Stringfellow wrote:
>> The point is that hash functions are many to one and I think the point
>> was about that verify wasn't really needed if the hash function is good
>> enough.
>
> This is a circular argument really, isn't it? Hash algorithms are never
> perfect, but we're trying to build a perfect one?
>
> It seems to me the obvious fix is to use hash to identify candidates for
> dedup, and then do the actual verify and dedup asynchronously. Perhaps a
> worker thread doing this at low priority?
> Did anyone consider this?
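
For readers who want the quoted proposal spelled out, here is a minimal sketch in Python of "hash to identify candidates, verify and dedup later". It is a toy in-memory model, not how ZFS implements dedup: the class `DedupStore`, its fields, and the `verify_pending` method are hypothetical names invented for illustration, and the low-priority worker thread is stood in for by an explicit call.

```python
import hashlib
from collections import deque


class DedupStore:
    """Toy block store: the hash only nominates dedup candidates;
    the byte-for-byte verify is deferred to a later pass."""

    def __init__(self):
        self.blocks = {}        # block_id -> bytes currently stored
        self.by_hash = {}       # digest -> id of the first block seen with it
        self.pending = deque()  # (new_id, candidate_id) pairs awaiting verify
        self.refs = {}          # deduped block_id -> block_id that holds its data
        self.next_id = 0

    def write(self, data: bytes) -> int:
        """Store the block immediately; queue it if its hash matches an older block."""
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        digest = hashlib.sha256(data).digest()
        candidate = self.by_hash.setdefault(digest, block_id)
        if candidate != block_id:
            self.pending.append((block_id, candidate))
        return block_id

    def verify_pending(self) -> int:
        """Stand-in for the low-priority worker: compare bytes, then merge.
        Returns the number of blocks actually deduplicated."""
        merged = 0
        while self.pending:
            new_id, candidate = self.pending.popleft()
            # The verify step: only identical contents are merged, so a
            # hash collision leaves both blocks stored separately.
            if self.blocks[new_id] == self.blocks[candidate]:
                del self.blocks[new_id]
                self.refs[new_id] = candidate
                merged += 1
        return merged

    def read(self, block_id: int) -> bytes:
        """Follow the dedup reference, if any, to the stored data."""
        return self.blocks[self.refs.get(block_id, block_id)]
```

In this sketch a write never waits on a comparison, but every queued pair still costs a read-and-compare of the candidate block when the deferred pass runs.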

This assumes you have low volumes of deduplicated data. As your dedup
ratio grows, so does the performance hit from dedup=verify. At, say,
dedupratio=10.0x, on average, every write results in 10 reads.

--
Saso
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss