comments at the bottom...

On Oct 23, 2010, at 1:48 AM, Erik Trimble wrote:

> On 10/22/2010 8:44 PM, Haudy Kazemi wrote:
>> Never Best wrote:
>>> Sorry I couldn't find this anywhere yet.  For deduping it is best to have 
>>> the lookup table in RAM, but I wasn't too sure how much RAM is suggested?
>>> 
>>> ::Assuming 128KB Block Sizes, and 100% unique data:
>>> 1TB*1024*1024*1024/128 = 8388608 Blocks
>>> ::Each Block needs 8 byte pointer?
>>> 8388608*8 = 67108864 bytes
>>> ::Ram suggest per TB
>>> 67108864/1024/1024 = 64MB
>>> 
>>> So if I understand correctly we should have a min of 64MB RAM per TB for 
>>> deduping? *hopes my math wasn't way off*, or is there significant extra 
>>> overhead stored per block for the lookup table?  For example, is there some 
>>> kind of redundancy in the lookup table (in relation to RAM space requirements) 
>>> to counter corruption?
>>> 
>>> I read some articles and they all mention that there is significant 
>>> performance loss if the table isn't in RAM, but none really mentioned how 
>>> much RAM one should have per TB of deduped data.
>>> 
>>> Thanks, hope someone can confirm *or give me the real numbers*.  I 
>>> know blocksize is variable; I'm most interested in the default ZFS setup 
>>> right now.
>> There were several detailed discussions about this over the past 6 months 
>> that should be in the archives.  I believe most of the info came from 
>> Richard Elling.
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> 
> Look for both my name and Richard's, going back about a year. In particular, 
> this thread started a good flow of data:
> 
> http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg35349.html
> 
> 
> bottom line: 270 bytes per record

Sometimes we see bigger sizes, but you have to have a lot of references before
the DDT entry gets bigger than 512 bytes.  Or, another way to look at this is:
for every record, you will be updating 512 bytes (or the minimum sector size).
This is why you'll hear me say that dedup changes big I/O into little I/O, but
it doesn't eliminate I/O.  Fortunately, modern SSDs do little I/O well.
Unfortunately, HDDs are better optimized for big I/O and are lousy for little
I/O.
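
To put a rough number on that, here is a back-of-the-envelope sketch in Python,
assuming one 512-byte DDT update per record and fully unique data (both are
assumptions for illustration, not measurements):

# Extra small-I/O generated by DDT updates when writing 1 TiB of
# unique data at the default 128K recordsize (rough estimate only).
TIB = 1 << 40
RECORDSIZE = 128 * 1024        # default ZFS recordsize
SECTOR = 512                   # assumed minimum write per DDT entry update

records = TIB // RECORDSIZE
ddt_io = records * SECTOR
print(f"{records:,} records -> {records:,} small DDT writes, "
      f"~{ddt_io / (1 << 30):.1f} GiB of additional small, random I/O")
# 8,388,608 records -> 8,388,608 small DDT writes,
# ~4.0 GiB of additional small, random I/O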

> so, for 4k record size, that  works out to be 67GB per 1 TB of unique data. 
> 128k record size means about 2GB per 1 TB.

Divide your ARC size by 4 to get the space actually available for the DDT,
because the DDT is considered metadata and the metadata limit defaults to 1/4
of the ARC size; in other words, plan on roughly 4x the DDT size in RAM.
Yes, there is an open bug on this.  No, the fix didn't make b147.  Yes, it is a
trivial fix and the limit can be tuned in the field.
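
Putting the numbers together, a quick sketch in Python, assuming ~270 bytes per
DDT entry, fully unique data, and the default 1/4 metadata limit (all figures
from above, rounded):

# ARC needed to keep the whole DDT in core, per TiB of unique data.
TIB = 1 << 40
DDT_ENTRY = 270                # approximate bytes per DDT entry
META_DIVISOR = 4               # metadata limited to 1/4 of the ARC by default

for recordsize in (4 * 1024, 128 * 1024):
    entries = TIB // recordsize
    ddt = entries * DDT_ENTRY
    arc = ddt * META_DIVISOR
    print(f"{recordsize // 1024:>3}K records: DDT ~{ddt / (1 << 30):5.1f} GiB, "
          f"ARC needed ~{arc / (1 << 30):6.1f} GiB")
#   4K records: DDT ~ 67.5 GiB, ARC needed ~ 270.0 GiB
# 128K records: DDT ~  2.1 GiB, ARC needed ~   8.4 GiB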

> dedup means buy a (big) SSD for L2ARC.

L2ARC directory entries take space, too.  SWAG around 200 bytes for each L2ARC
record.
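
For a sense of scale, a sketch assuming ~200 bytes of ARC per L2ARC record and
a hypothetical 256 GiB L2ARC device (both the device size and the per-record
figure are rough assumptions):

# ARC consumed just to index the contents of an L2ARC device.
L2ARC_BYTES = 256 * (1 << 30)  # hypothetical 256 GiB cache device
HEADER = 200                   # SWAG: bytes of ARC per L2ARC record

for recordsize in (4 * 1024, 128 * 1024):
    records = L2ARC_BYTES // recordsize
    overhead = records * HEADER
    print(f"{recordsize // 1024:>3}K records: ~{overhead / (1 << 30):5.2f} GiB "
          f"of ARC to index the L2ARC")
#   4K records: ~12.50 GiB of ARC to index the L2ARC
# 128K records: ~ 0.39 GiB of ARC to index the L2ARC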
 -- richard

-- 
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
USENIX LISA '10 Conference, November 7-12, San Jose, CA
ZFS and performance consulting
http://www.RichardElling.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
