comments at the bottom...

On Oct 23, 2010, at 1:48 AM, Erik Trimble wrote:
> On 10/22/2010 8:44 PM, Haudy Kazemi wrote:
>> Never Best wrote:
>>> Sorry I couldn't find this anywhere yet. For deduping it is best to have
>>> the lookup table in RAM, but I wasn't too sure how much RAM is suggested?
>>>
>>> ::Assuming 128KB block sizes, and 100% unique data:
>>> 1TB*1024*1024*1024/128 = 8388608 blocks
>>> ::Each block needs an 8 byte pointer?
>>> 8388608*8 = 67108864 bytes
>>> ::RAM suggested per TB
>>> 67108864/1024/1024 = 64MB
>>>
>>> So if I understand correctly we should have a minimum of 64MB RAM per TB
>>> for deduping? *hopes my math wasn't way off* Or is there significant extra
>>> overhead stored per block for the lookup table? For example, is there some
>>> kind of redundancy on the lookup table (with implications for the RAM
>>> space requirements) to counter corruption?
>>>
>>> I read some articles and they all mention that there is significant
>>> performance loss if the table isn't in RAM, but none really mentioned how
>>> much RAM one should have per TB of deduped data.
>>>
>>> Thanks, hope someone can confirm *or give me the real numbers*. I know the
>>> block size is variable; I'm most interested in the default ZFS setup right
>>> now.
>>
>> There were several detailed discussions about this over the past 6 months
>> that should be in the archives. I believe most of the info came from
>> Richard Elling.
>
> Look for both my name and Richard's, going back about a year. In particular,
> this thread started out with a good data flow:
>
> http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg35349.html
>
> bottom line: 270 bytes per record

Sometimes we see bigger sizes, but you have to have a lot of references before
the DDT entry gets bigger than 512 bytes. Or, another way to look at it: for
every record, you will be updating 512 bytes (or the minimum sector size).
This is why you'll hear me say that dedup changes big I/O into little I/O, but
it doesn't eliminate I/O. Fortunately, modern SSDs do little I/O well.
Unfortunately, HDDs are better optimized for big I/O and are lousy at little
I/O.

> so, for 4k record size, that works out to be 67GB per 1 TB of unique data.
> 128k record size means about 2GB per 1 TB.

Divide by 4, because the DDT is considered metadata and the metadata limit is
1/4 of the ARC size. Yes, there is an open bug on this. No, it didn't make
b147. Yes, it is a trivial fix and can be tuned in the field.

> dedup means buy a (big) SSD for L2ARC.

L2ARC directory entries take space, too. SWAG around 200 bytes for each L2ARC
record.
 -- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
USENIX LISA '10 Conference, November 7-12, San Jose, CA

ZFS and performance consulting
http://www.RichardElling.com
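For anyone running the same estimate against their own pool, here is a minimal
back-of-envelope sketch in Python of the arithmetic discussed above. It uses
the rough per-entry figures quoted in the thread (about 270 bytes per in-core
DDT entry, about 200 bytes of ARC header per L2ARC record); the function names
and the 256 GiB L2ARC device in the example are illustrative assumptions, not
ZFS's actual accounting.

    # Rough DDT / L2ARC RAM sizing, using the per-entry figures quoted above.
    # These are ballpark averages: a heavily referenced DDT entry can grow
    # toward 512 bytes, so treat the results as lower bounds.

    DDT_BYTES_PER_ENTRY = 270     # approximate in-core dedup table entry
    L2ARC_BYTES_PER_ENTRY = 200   # SWAG: ARC header tracking each L2ARC record

    def ddt_ram_bytes(unique_data_bytes, recordsize):
        # one DDT entry per unique record in the pool
        return (unique_data_bytes // recordsize) * DDT_BYTES_PER_ENTRY

    def l2arc_header_ram_bytes(l2arc_device_bytes, recordsize):
        # each record cached on the L2ARC device needs a header kept in ARC
        return (l2arc_device_bytes // recordsize) * L2ARC_BYTES_PER_ENTRY

    TiB = 1 << 40
    GiB = 1 << 30
    print(ddt_ram_bytes(1 * TiB, 128 * 1024) / GiB)   # ~2.1 GiB per TiB at 128K records
    print(ddt_ram_bytes(1 * TiB, 4 * 1024) / GiB)     # ~67.5 GiB per TiB at 4K records
    print(l2arc_header_ram_bytes(256 * GiB, 128 * 1024) / GiB)  # ~0.4 GiB for a 256 GiB L2ARC

Since the DDT is charged as metadata and the metadata limit defaults to 1/4 of
the ARC (tunable, per the note above), plan on roughly four times the DDT
figure in ARC if the whole table should stay resident without raising that
limit.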