On 07/11/2012 01:36 PM, casper....@oracle.com wrote:
>> This assumes you have low volumes of deduplicated data. As your dedup
>> ratio grows, so does the performance hit from dedup=verify. At, say,
>> dedupratio=10.0x, on average, every write results in 10 reads.
> I don't follow.
> If dedupratio == 10, it means that each item is *referenced* 10 times
> but it is only stored *once*. Only when you have hash collisions then
> multiple reads would be needed.
> Only one read is needed except in the case of hash collisions.
No, *every* dedup write will result in a block read. This is how:
1) ZIO gets block X and computes HASH(X)
2) ZIO looks up HASH(X) in DDT
2a) HASH(X) not in DDT -> unique write; exit
2b) HASH(X) in DDT; continue
3) Read original disk block Y with HASH(Y) = HASH(X) <-- here's the read
4) Verify X == Y
4a) X == Y; increment refcount
4b) X != Y; hash collision; write new block <-- here's the collision
In other words, by the time you can tell whether you've got a hash
collision, you've already done the read. Ergo, every write that actually
dedups (i.e., hits in the DDT) creates a read!
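To make the flow concrete, here's a rough C sketch of the decision path
above. It is not the actual ZFS code (the real logic lives in
zio_ddt_write() / ddt.c); the helper names, the ddt_entry_t layout, and
the 128K block size are illustrative assumptions only:

/*
 * Simplified sketch of the dedup=verify write path described above.
 * ddt_lookup(), read_block() and write_unique_block() are assumed
 * helpers, not real ZFS APIs.
 */
#include <stdlib.h>
#include <string.h>

enum { BLOCKSIZE = 128 * 1024 };          /* assume 128K records */

typedef unsigned char hash_t[32];         /* e.g. a SHA-256 checksum */

typedef struct ddt_entry {
    hash_t        hash;                   /* HASH(Y) of the stored block */
    unsigned long refcount;               /* logical references to it    */
    unsigned long disk_addr;              /* where block Y lives on disk */
} ddt_entry_t;

/* Assumed helpers, not real ZFS functions: */
extern ddt_entry_t *ddt_lookup(const unsigned char *h);   /* NULL if absent */
extern void read_block(unsigned long addr, void *buf, size_t len);
extern void write_unique_block(const void *buf, size_t len);

void
dedup_verify_write(const void *x, const unsigned char *hash_x)
{
    ddt_entry_t *e = ddt_lookup(hash_x);          /* step 2: DDT lookup    */

    if (e == NULL) {                              /* step 2a: unique data  */
        write_unique_block(x, BLOCKSIZE);         /* no extra read needed  */
        return;
    }

    /*
     * Step 3: hash already in DDT -- read the existing block Y.
     * This read happens on EVERY write that hits in the DDT, i.e.
     * on every write that actually deduplicates.
     */
    void *y = malloc(BLOCKSIZE);
    read_block(e->disk_addr, y, BLOCKSIZE);

    if (memcmp(x, y, BLOCKSIZE) == 0)             /* step 4a: real dup     */
        e->refcount++;
    else                                          /* step 4b: collision    */
        write_unique_block(x, BLOCKSIZE);

    free(y);
}

The key point is visible in the sketch: the read in step 3 happens before
the comparison in step 4 can tell you whether the match was real, so every
write that hits in the DDT pays for one read, collision or not.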