I have a hypothetical question regarding ZFS deduplication.  Does the L1ARC cache benefit from deduplication
in the sense that the L1ARC only needs to cache one copy of the deduplicated data rather than many copies?
Here is an example:

Imagine that I have a server with 2TB of RAM and a PB of disk storage.  On this server I create a single 1TB
data file that is full of unique data.  Then I make 9 copies of that file, giving each file a unique name and
location within the same ZFS zpool.  If I start up 10 application instances, where each instance reads all of
its own copy of the data, will the L1ARC contain only the deduplicated data, or will it cache separate
copies of the data from each file?  In simpler terms, will the L1ARC require 10TB of RAM or just 1TB of RAM to
cache all ten 1TB files' worth of data?
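
To make the two outcomes concrete, here is a back-of-the-envelope sketch of the arithmetic.  It does not
model ZFS itself; the function name and the "dedup-aware" flag are my own hypothetical labels for the two
possibilities I am asking about, and it assumes every block in the copies deduplicates perfectly.

```python
def cache_footprint_tb(unique_tb, total_files, cache_is_dedup_aware):
    """Return the RAM (in TB) needed to cache every file's data.

    Hypothetical model only: assumes all copies deduplicate perfectly
    and ignores the deduplication table and other metadata.
    """
    if cache_is_dedup_aware:
        # One cached copy of each unique block serves every file.
        return unique_tb
    # Each file's blocks are cached separately.
    return unique_tb * total_files


ram_tb = 2              # RAM in the example server
total_files = 1 + 9     # the original file plus 9 copies

separate = cache_footprint_tb(1, total_files, cache_is_dedup_aware=False)
shared = cache_footprint_tb(1, total_files, cache_is_dedup_aware=True)

print(f"separate copies cached: {separate} TB, fits in RAM: {separate <= ram_tb}")
print(f"one shared copy cached: {shared} TB, fits in RAM: {shared <= ram_tb}")
```

With 2TB of RAM, only the second case would let all ten readers be served entirely from cache.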

My hope is that, since the data physically occupies only 1TB of storage thanks to deduplication, the L1ARC
will also require only 1TB of RAM for the data.

Note that I know the deduplication table will use the L1ARC as well.  However, the focus of my question
is on how the L1ARC would benefit from a data caching standpoint.

Thanks in advance!

Brad Diggs | Principal Sales Consultant

zfs-discuss mailing list
