On Tue, Jul 8, 2008 at 1:26 PM, Bob Friesenhahn <[EMAIL PROTECTED]> wrote:
> Something else came to mind which is a negative regarding
> deduplication. When zfs writes new sequential files, it should try to
> allocate blocks in a way which minimizes "fragmentation" (disk seeks).
> Disk seeks are the bane of existing storage systems since they come
> out of the available IOPS budget, which is only a couple hundred
> ops/second per drive. The deduplication algorithm will surely result
> in increasing effective fragmentation (decreasing sequential
> performance) since duplicated blocks will result in a seek to the
> master copy of the block followed by a seek to the next block. Disk
> seeks will remain an issue until rotating media goes away, which (in
> spite of popular opinion) is likely quite a while from now.
>
> Someone has to play devil's advocate here. :-)
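Bob's worst case is easy to put rough numbers on. A back-of-envelope
sketch in Python (the 200 ops/second and 128K record figures are
assumptions for illustration, not ZFS measurements):

FILE_SIZE = 38 * 1024 * 1024       # e.g. rt.jar below, ~38 MB
RECORD_SIZE = 128 * 1024           # default ZFS recordsize
SEEKS_PER_SEC = 200                # rough budget for one spindle

blocks = FILE_SIZE // RECORD_SIZE  # ~304 blocks

# Worst case: every block is a dedup reference to a distant master
# copy, so each read costs a seek there plus a seek back to resume.
worst_case_seeks = 2 * blocks
print("worst-case seek overhead: %.1f s" % (worst_case_seeks / SEEKS_PER_SEC))

# A purely sequential layout reads the same file after one seek.

About three seconds of pure seeking for a 38 MB file, in the worst
case, on one spindle.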
With L2ARC on SSD, though, seeks are effectively free and IOPS are
quite cheap compared to spinning rust. Cold reads may be a problem,
but there is a reasonable chance that L2ARC sizing can help here.
Also, the blocks that are likely to be duplicates will come from the
same file content, just at a different offset. That is, this file is
going to be identical in every one of my LDom disk images:

# du -h /usr/jdk/instances/jdk1.5.0/jre/lib/rt.jar
 38M   /usr/jdk/instances/jdk1.5.0/jre/lib/rt.jar

There is a pretty good chance that the first copy will be written
sequentially, and as a result all of the deduped copies will be
sequential as well. What's more, it is quite likely to be in the ARC
or L2ARC.
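To make the layout argument concrete, here is a toy content-addressed
block store (a sketch only; the block size, hashing, and allocation
policy are illustrative assumptions, not the actual ZFS dedup design).
Every copy of an identical file resolves to the same physical blocks,
so later copies inherit whatever layout the first copy got:

import hashlib

class DedupStore:
    """Toy content-addressed store: one physical copy per unique block."""
    def __init__(self):
        self.blocks = {}     # block checksum -> physical block address
        self.next_pba = 0    # next free physical block address

    def write(self, data, block_size=128 * 1024):
        # Return the physical layout (list of block addresses) that a
        # file ends up with after dedup.
        layout = []
        for off in range(0, len(data), block_size):
            csum = hashlib.sha256(data[off:off + block_size]).digest()
            if csum not in self.blocks:       # first copy allocates...
                self.blocks[csum] = self.next_pba
                self.next_pba += 1
            layout.append(self.blocks[csum])  # ...later copies reference
        return layout

store = DedupStore()
# Eight distinct 128K blocks standing in for a file like rt.jar.
fake_jar = b"".join(bytes([i]) * (128 * 1024) for i in range(8))
first = store.write(fake_jar)    # allocated sequentially: [0, 1, ..., 7]
second = store.write(fake_jar)   # same physical blocks, nothing new written
assert first == second

If the first writer got a sequential allocation, every clone reads the
same sequential run of disk, and those shared blocks only need to be
cached once for all of them.

--
Mike Gerdts
http://mgerdts.blogspot.com/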