On 7/9/2010 5:18 PM, Brandon High wrote:
On Fri, Jul 9, 2010 at 5:00 PM, Edward Ned Harvey
<solar...@nedharvey.com> wrote:
The default ZFS block size is 128K. If you have a filesystem with
128G used, that means you are consuming 1,048,576 blocks, each of
which must be checksummed. ZFS uses adler32 and sha256, which
means 4 bytes and 32 bytes ... 36 bytes * 1M blocks = an extra 36
Mbytes and some fluff consumed by enabling dedup.
I suspect my numbers are off, because 36 Mbytes seems impossibly
small. But I hope some sort of similar (and more correct) logic
will apply. ;-)
I think that DDT entries are a little bigger than what you're using.
The size seems to range between 150 and 250 bytes depending on how
it's calculated, so call it 200 bytes each. Your 128G dataset would require
closer to 200M (+/- 25%) for the DDT if your data were completely
unique; 1TB of unique data would require roughly 1.6G (+/- 25%) for the DDT.
The numbers are fuzzy of course, and assume only 128k blocks. Lots of
small files will increase the memory cost of dedup, and using it on a
zvol with the default block size (8k) would require 16 times the
memory.
-B
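To make that arithmetic concrete, here is a small Python sketch of the same
back-of-envelope estimate. The per-entry sizes are just the rough figures
quoted in this thread, it assumes completely unique data, and nothing is
read from an actual pool:

# One DDT entry per unique block, at a guessed per-entry cost.
def ddt_estimate(data_bytes, block_size, bytes_per_entry):
    entries = data_bytes // block_size      # one entry per unique block
    return entries * bytes_per_entry        # estimated DDT size in bytes

GiB, TiB, MiB = 1024 ** 3, 1024 ** 4, 1024 ** 2

for label, size, bs in (("128G dataset, 128K blocks", 128 * GiB, 128 * 1024),
                        ("1T unique data, 128K blocks", TiB, 128 * 1024),
                        ("1T zvol, 8K volblocksize", TiB, 8 * 1024)):
    low = ddt_estimate(size, bs, 150) // MiB
    high = ddt_estimate(size, bs, 250) // MiB
    print("%-30s ~%d - %d MiB" % (label, low, high))

The last case is the 8k zvol mentioned above: the same amount of data costs
roughly sixteen times as much DDT as it does at 128k.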
Go back and read the several threads from last month about ZFS/L2ARC memory usage
for dedup. In particular, I've been quite specific about how to
calculate estimated DDT size. Richard has also been quite good at
giving size estimates (as well as explaining how to see current block
size usage in a filesystem).
The structure in question is this one:
ddt_entry
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/sys/ddt.h#108
I'd have to fire up an IDE to track down all the sizes of the ddt_entry
structure's members, but I feel comfortable using Richard's 270
bytes-per-entry estimate.
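For what it's worth, plugging that 270 bytes-per-entry figure into the same
back-of-envelope math (again assuming one DDT entry per unique block, and
deriving the block count by simple division, which only approximates what a
real pool holds):

BYTES_PER_DDT_ENTRY = 270   # Richard's estimate, not a sizeof(ddt_entry)

def ddt_ram_bytes(used_bytes, avg_block_size):
    blocks = used_bytes // avg_block_size   # one DDT entry per unique block
    return blocks * BYTES_PER_DDT_ENTRY

GiB, MiB = 1024 ** 3, 1024 ** 2
print("128G at 128K blocks: ~%d MiB" % (ddt_ram_bytes(128 * GiB, 128 * 1024) // MiB))
print("128G at   8K blocks: ~%d MiB" % (ddt_ram_bytes(128 * GiB, 8 * 1024) // MiB))

That comes out to roughly 270 MiB of DDT for a 128G dataset at the default
128k recordsize, and a bit over 4 GiB for the same data at an 8k block size.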
--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA