Tomas Ögren wrote:
> Hello.
>
> Executive summary: I want arc_data_limit (like arc_meta_limit, but for
> data) and set it to 0.5G or so. Is there any way to "simulate" it?
>   

We describe how to limit the size of the ARC in the Evil Tuning Guide.
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
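
For example, capping the ARC at 512MB via /etc/system, as the guide
describes (value in bytes; takes effect at the next boot):

  set zfs:zfs_arc_max = 0x20000000
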
 -- richard

> We have a cluster of Linux frontends (http/ftp/rsync) for
> Debian/Mozilla/etc archives, and as an NFS disk backend we currently
> have a DL145 running OpenSolaris (snv98) with one pool of 3 raidz2
> vdevs with 10 SATA disks each.
>
> The frontends have a local cache on a few raid0'd disks, but need to
> dip into the backend every now and then because they can't hold all
> the data. Rsyncs also dip into the backend for filesystem traversal,
> both when pulling (for us: writing) data and when sending to others.
>
> Obviously, the working data set (4TB or so right now) is quite a lot
> larger than the RAM on the disk backend (8GB), so caching file data
> is mostly useless. Caching metadata is worthwhile, though: rsync
> walks the whole tree every now and then, and metadata is the only
> part with a chance of fitting in RAM (about 1.5-2M files now).
>
> So I want to dedicate as much RAM as possible to the metadata cache;
> the data cache is of less importance.
>
> Right now, ZFS has a knob to limit the amount of metadata cache
> (arc_meta_limit), but no corresponding knob to limit the amount of
> data cache.
>
> I've tried a few tuning tricks (sketched as commands after the list
> below), but they all seem to have drawbacks.
>
> * zfs set primarycache=metadata myfs
> - If recordsize is 128k and an application reads 32k, then ZFS reads
>   128k from disk, hands 32k to the app and throws away the other 96k.
>   Repeat. (So physical IO is 400% of logical IO.) If I tune recordsize
>   down to 32k, then each disk gets 32k/8 (8 data disks per raidz2) =
>   4k IOs, which isn't at all optimal: 4k/IO * 100 IOPS * 8 disks
>   * 3 raidz2s = 9600kB/s. I would also prefer each disk to get larger
>   IO blocks than it does with recordsize=128k.
>
> * zfs:zfs_prefetch_disable
> - Prefetching in small amounts could be good, but only with a limit on
>   how much prefetched data is kept; as it is, prefetch eats precious
>   RAM that I want for the metadata cache.
>
> * zfs:arc_meta_limit
> - I'm raising it to about the size of the whole ARC. But what I really
>   want is an arc_data_limit, set to 512M or so, just for temporary
>   buffers.
>
> * ncsize
> - I want to keep this high, but with ZFS each cached name seems to
>   drag along huge amounts of memory via dnode_t/zfs_znode_cache or
>   some such.
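>
> For reference, roughly how I've been applying the above (a sketch;
> the filesystem name and the values are illustrative):
>
>   # per-filesystem cache policy and record size
>   zfs set primarycache=metadata myfs
>   zfs set recordsize=32k myfs    # only affects newly written files
>
>   # in /etc/system; takes effect at the next boot
>   set zfs:zfs_prefetch_disable = 1
>   set ncsize = 0x100000
>
>   # arc_meta_limit I poke at runtime with mdb (0x180000000 = 6GB)
>   echo "arc_meta_limit/Z 0x180000000" | mdb -kw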
>
> I think most of the performance issues would be solved if I let ZFS
> do all of its prefetching but limited the amount of data it keeps
> cached.
>
> Wouldn't this problem arise on most file servers?
>
>
> When checking ::arc / ::kmastat / ::memstat, I usually see close to
> 1GB on the freelist (probably due to c_max being ~7G), ZFS File Data
> at ~0 (when running with primarycache=metadata), and yet
> arc_meta_used is only about 2GB.. where does the other 5GB of
> 'Kernel' memory go?
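>
> (I'm reading these off the live kernel with mdb, i.e.:
>
>   echo ::memstat | mdb -k
>   echo ::arc | mdb -k
>   echo ::kmastat | mdb -k
> )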
>
> Large consumers in ::kmastat are (columns: cache name, buf size, bufs
> in use, bufs total, memory in use, allocs succeeded, alloc failures):
> dnode_t                      656 1260774 1540038 1051332608B  28987865     0
> rnode4_cache                 968 1000000 1000000 1024000000B   1000000     0
> kmem_va_16384              16384  50389  59952 982253568B  68595231     0
> kmem_va_4096                4096 1247328 1268576 901120000B   5788758     0
> zio_buf_16384              16384  50413  50437 826359808B 251118744     0
> zio_buf_512                  512 1255757 1502096 769073152B  95757654     0
> vn_cache                     200 2022348 2780205 759181312B  12796286     0
> kmem_va_8192                8192  14923  80336 658112512B   1239789     0
> zio_buf_65536              65536   5156   5160 338165760B  49769732     0
> dmu_buf_impl_t               192 1306533 1594440 326541312B 103503254     0
>
>
> Could I do some trickery by creating a 5-6GB ramdisk, adding it as an
> L2ARC cache device, and setting secondarycache=metadata with
> primarycache=all? Or would it be better (due to how ZFS migrates data
> from the primary to the secondary cache) to have
> primarycache=metadata and secondarycache=all with the ramdisk L2ARC?
> How does ZFS currently cope if the L2 device is blank/missing at
> boot?
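>
> I.e., something along these lines (an untested sketch; the ramdisk
> and pool names are made up):
>
>   ramdiskadm -a l2meta 5g
>   zpool add mypool cache /dev/ramdisk/l2meta
>   zfs set primarycache=all mypool
>   zfs set secondarycache=metadata mypool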
>
> Maybe this trickery would starve the DNLC too, though..
>
> /Tomas
>   

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
