[zfs-discuss] Resolving performance issue w/ deduplication (NexentaStor)

Ray Van Dolson Thu, 29 Dec 2011 22:33:13 -0800

Hi all;

We have a dev box running NexentaStor Community Edition 3.1.1 w/ 24GB
RAM and an 8.5TB pool with deduplication enabled (1.9TB or so in use).
Dedupe ratio is only 1.26x.  (We don't run dedupe on production boxes
-- and we do pay for Nexenta licenses on prd as well.)

The box has an SLC-based SSD as ZIL and a 300GB MLC SSD as L2ARC.

The box has been performing fairly poorly lately, and we're thinking
it's due to deduplication:

  # echo "::arc" | mdb -k | grep arc_meta
  arc_meta_used             =      5884 MB
  arc_meta_limit            =      5885 MB
  arc_meta_max              =      5888 MB

  # zpool status -D
  ...
  DDT entries 24529444, size 331 on disk, 185 in core

So, not only are we using up all of our metadata cache, but the DDT
is taking up a pretty significant chunk of that (over 70%).
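
As a rough check on that figure, the in-core DDT footprint can be
estimated from the zpool status -D line above.  This is only a sketch:
it assumes the "in core" value is bytes per entry and that the whole
DDT is resident in ARC metadata, and the variable names are made up
for illustration.

```shell
#!/bin/sh
# Values copied from the mdb and zpool status -D output above.
ddt_entries=24529444      # "DDT entries"
core_bytes=185            # "in core" size, assumed to be bytes per entry
arc_meta_used_mb=5884     # arc_meta_used

# Total in-core DDT size in MB (integer division, 1 MB = 1048576 bytes).
ddt_core_mb=$(( ddt_entries * core_bytes / 1048576 ))

# Share of the metadata cache taken by the DDT, in whole percent.
ddt_share_pct=$(( ddt_core_mb * 100 / arc_meta_used_mb ))

echo "DDT in core: ${ddt_core_mb} MB (${ddt_share_pct}% of arc_meta_used)"
```

With the numbers above this works out to roughly 4.3GB, or a bit over
70% of arc_meta_used.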

ARC sizing is as follows:

  p                         =     15331 MB
  c                         =     16354 MB
  c_min                     =      2942 MB
  c_max                     =     23542 MB
  size                      =     16353 MB

I'm not really sure how to determine how many blocks are on this zpool
(is it the same as the # of DDT entries? -- deduplication has been on
since pool creation).  If I use a 64KB block size average, I get about
31 million blocks, but DDT entries are 24 million ....
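
For reference, the arithmetic behind those two numbers can be sketched
as below.  The first estimate assumes "1.9TB" means 1.9 TiB and a 64KB
average block, both of which are guesses.  The second is a further
assumption on my part: if each DDT entry tracks one unique block and
the 1.26x ratio holds per block, then entries times ratio should
approximate the number of referenced blocks.

```shell
#!/bin/sh
# Estimate 1: pool usage divided by an assumed average block size.
used_tenths_tib=19        # 1.9 TiB, kept as tenths to stay in integers
avg_block=65536           # assumed 64KB average block size
est_blocks=$(( used_tenths_tib * 1099511627776 / 10 / avg_block ))

# Estimate 2: unique blocks (DDT entries) scaled by the dedupe ratio.
ddt_entries=24529444      # from zpool status -D
ratio_x100=126            # 1.26x, scaled by 100 to stay in integers
ref_blocks=$(( ddt_entries * ratio_x100 / 100 ))

echo "blocks from usage / 64KB:     ${est_blocks}"
echo "blocks from entries * ratio:  ${ref_blocks}"
```

If those assumptions hold, the two estimates land within a few percent
of each other, which would explain the gap between 31 million and 24
million.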

zdb -DD and zdb -bb | grep 'bp count' both do not complete (zdb says
I/O error).  Probably because the pool is in use and is quite busy.

Without the block count I'm having a hard time determining how much
memory we _should_ have.  I can only speculate that it's "more" at this
point. :)

If I assume 24 million blocks is about accurate (from zpool status -D
output above), then at 320 bytes per block we're looking at about 7.1GB
for DDT size.  We do have L2ARC, though I'm not sure how ZFS
decides what portion of the DDT stays in memory and what can go to
L2ARC -- if all of it went to L2ARC, then the references to this
information in arc_meta would be (at 176 bytes * 24 million blocks)
around 4GB -- which again is a good chunk of arc_meta_max.
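
The two sizing estimates above can be reproduced with the sketch
below.  The 320-byte and 176-byte per-block figures are simply the ones
quoted in this message, not values I have verified for this release,
and the variable names are illustrative.

```shell
#!/bin/sh
blocks=24000000           # rounded DDT entry count from zpool status -D
ddt_entry_bytes=320       # assumed in-memory cost per DDT entry
l2arc_hdr_bytes=176       # assumed ARC header cost per block held in L2ARC
arc_meta_limit_mb=5885    # arc_meta_limit from the mdb output

# Case 1: the whole DDT is held in RAM.
ddt_mb=$(( blocks * ddt_entry_bytes / 1048576 ))

# Case 2: the whole DDT sits in L2ARC and only the headers stay in RAM.
l2hdr_mb=$(( blocks * l2arc_hdr_bytes / 1048576 ))
l2hdr_pct=$(( l2hdr_mb * 100 / arc_meta_limit_mb ))

echo "DDT fully in RAM:      ${ddt_mb} MB (limit is ${arc_meta_limit_mb} MB)"
echo "L2ARC headers in RAM:  ${l2hdr_mb} MB (${l2hdr_pct}% of arc_meta_limit)"
```

Either way, the first case exceeds the current metadata limit and the
second still consumes about two thirds of it.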

Given that our dedupe ratio on this pool is fairly low anyway, I'm
looking for strategies to back out.  Should we just disable
deduplication and then maybe bump up arc_meta_limit?  Maybe also
increase the size of arc.size as well (8GB left for the system seems
higher than we need)?

Is there a non-disruptive way to undeduplicate everything and expunge
the DDT?  zfs send/recv and then back perhaps (we have the extra
space)?

Thanks,
Ray

[1] http://markmail.org/message/db55j6zetifn4jkd
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
