Re: [zfs-discuss] Improving L1ARC cache efficiency with dedup

Nathan Kroenert Sun, 11 Dec 2011 03:12:10 -0800

 On 12/11/11 01:05 AM, Pawel Jakub Dawidek wrote:

On Wed, Dec 07, 2011 at 10:48:43PM +0200, Mertol Ozyoney wrote:

Unfortunately the answer is no. Neither the L1 nor the L2 cache is dedup aware.


The only vendor I know of that can do this is NetApp.

And you really work at Oracle?:)

The answer is definitely yes. ARC caches on-disk blocks, and dedup just
references those blocks. When you read, the dedup code is not involved at all.
Let me show it to you with a simple test:

Create a file (dedup is on):

        # dd if=/dev/random of=/foo/a bs=1m count=1024

Copy this file so that it is deduped:

        # dd if=/foo/a of=/foo/b bs=1m

Export the pool so that all cached data is dropped, then reimport it:

        # zpool export foo
        # zpool import foo

Now let's read one file:

        # dd if=/foo/a of=/dev/null bs=1m
        1073741824 bytes transferred in 10.855750 secs (98909962 bytes/sec)

We read file 'a' and all its blocks are in cache now. The 'b' file
shares all the same blocks, so if ARC caches blocks only once, reading
'b' should be much faster:

        # dd if=/foo/b of=/dev/null bs=1m
        1073741824 bytes transferred in 0.870501 secs (1233475634 bytes/sec)

Now look at it: 'b' was read 12.5 times faster than 'a', with no disk
activity. Magic? :)
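
As a quick sanity check on the 12.5x figure, the ratio can be recomputed from the two dd timings quoted above. This is only a minimal sketch: the function name `speedup` is made up for illustration (it is not part of dd or ZFS), and it assumes nothing beyond a POSIX shell and awk.

```shell
# speedup SLOW_SECS FAST_SECS
# Hypothetical helper: prints how many times faster the second read was,
# rounded to one decimal place.
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# Timings taken from the two dd runs above ('a' from disk, 'b' from ARC).
speedup 10.855750 0.870501
```

With those timings the ratio rounds to 12.5, which matches the number quoted in the message.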


Hey all,

That reminds me of something I have been wondering about... Why only 12x faster? If we are effectively reading from memory, as compared to a disk reading at approximately 100MB/s (which is about average for a PC HDD reading sequentially), I'd have thought it should be a lot faster than 12x.

Can we really pull stuff from cache at only a little over one gigabyte per second if it's dedup data?
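
For reference, the bytes/sec figures that dd printed can be converted into more familiar units. A rough sketch, assuming a POSIX shell with awk; the helper name `rate` is invented here, and it only restates the dd output rather than measuring anything:

```shell
# rate BYTES_PER_SEC
# Hypothetical helper: prints a dd throughput figure in decimal MB/s
# and binary GiB/s.
rate() {
    awk -v bps="$1" 'BEGIN {
        printf "%.1f MB/s, %.2f GiB/s\n", bps / 1000000, bps / (1024 * 1024 * 1024)
    }'
}

rate 98909962      # first read of 'a' (from disk)
rate 1233475634    # read of 'b' (from cache)
```

That works out to roughly 98.9 MB/s for the disk read and about 1.15 GiB/s for the cached read, so "a little over one gigabyte per second" is a fair summary of what dd reported.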


Cheers!

Nathan.


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
