Tomas Ögren wrote:
> Hello.
>
> Executive summary: I want arc_data_limit (like arc_meta_limit, but for
> data) and set it to 0.5G or so. Is there any way to "simulate" it?

We describe how to limit the size of the ARC cache in the Evil Tuning
Guide:
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
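For a box like yours, the short version is to cap the whole ARC from
/etc/system and raise the metadata limit toward that cap. A minimal
sketch -- the values below are illustrative for an 8GB machine, not a
recommendation:

    * /etc/system -- values in bytes, take effect at next boot
    * Cap the ARC at 3GB total (illustrative value)
    set zfs:zfs_arc_max = 0xC0000000
    * Allow metadata to use nearly all of the ARC
    set zfs:zfs_arc_meta_limit = 0xB0000000

There is still no data-side limit, as you note; the closest you can get
today is a small zfs_arc_max with zfs_arc_meta_limit raised toward it.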
 -- richard

> We have a cluster of linux frontends (http/ftp/rsync) for
> Debian/Mozilla/etc archives, and as an NFS disk backend we currently
> have a DL145 running OpenSolaris (snv_98) with one pool of 3 raidz2
> vdevs of 10 SATA disks each.
>
> The frontends have a local cache of a few raid0'd disks, but need to
> dip into the backend every now and then because they can't fit all the
> data. Rsyncs dip into the backend for filesystem traversal, both when
> pulling (for us: writing) data and when sending to others.
>
> Obviously, the working data set (4TB or so right now) is quite a lot
> larger than the RAM on the disk backend (8GB), so caching data is
> mostly useless. Caching metadata is good, because rsync walks the
> whole tree every now and then, and that is the only part with a chance
> of fitting in RAM (about 1.5-2M files now).
>
> So I want to dedicate as much RAM as possible to the metadata cache;
> the data cache is of much less importance.
>
> Right now ZFS has a knob to limit the amount of metadata cached
> (arc_meta_limit), but no knob to limit the amount of data cached.
>
> I've tried a few tuning tricks, but they all seem to have drawbacks:
>
> * zfs set primarycache=metadata myfs
>   - If recordsize is 128k and an application reads 32k, ZFS reads
>     128k, hands 32k to the app and throws away 96k. Repeat (so I get
>     400% physical I/O over logical I/O). If I tune recordsize down to
>     32k, each disk gets 32k/8 (8 data disks per raidz2) = 4k I/Os,
>     which is not optimal at all: 4kB/IO * 100 IOPS * 8 disks *
>     3 raidz2s = 9600kB/s. I would also prefer each disk to see larger
>     I/Os than it does with rs=128k.
>
> * zfs:zfs_prefetch_disable
>   - Prefetching in small amounts could be good, but only with a limit
>     on how much of it is kept; as it is, prefetch uses up precious RAM
>     that I want for the metadata cache.
>
> * zfs:arc_meta_limit
>   - I'm raising it to about the size of the whole ARC. But what I
>     really want is an arc_data_limit set to 512M or so, just for
>     temporary buffers.
>
> * ncsize
>   - I want to keep this high, but with ZFS each entry seems to pin
>     huge amounts of memory via dnode_t/zfs_znode_cache or something.
>
> I think most of the performance issues would be solved if I let ZFS do
> all of its prefetching but limited the amount of data it caches.
>
> Wouldn't this problem arise on most file servers?
>
> When checking ::arc / ::kmastat / ::memstat, I usually see close to
> 1GB on the freelist (probably because c_max is ~7G), ZFS File Data at
> ~0 (when running primarycache=metadata), and still arc_meta_used is
> only about 2GB. Where does the other 5GB of 'Kernel' memory go?
>
> Large consumers in ::kmastat are:
>
>   cache name       buf size  bufs in use  bufs total  memory in use  alloc succeed  alloc fail
>   dnode_t               656      1260774     1540038    1051332608B       28987865           0
>   rnode4_cache          968      1000000     1000000    1024000000B        1000000           0
>   kmem_va_16384       16384        50389       59952     982253568B       68595231           0
>   kmem_va_4096         4096      1247328     1268576     901120000B        5788758           0
>   zio_buf_16384       16384        50413       50437     826359808B      251118744           0
>   zio_buf_512           512      1255757     1502096     769073152B       95757654           0
>   vn_cache              200      2022348     2780205     759181312B       12796286           0
>   kmem_va_8192         8192        14923       80336     658112512B        1239789           0
>   zio_buf_65536       65536         5156        5160     338165760B       49769732           0
>   dmu_buf_impl_t        192      1306533     1594440     326541312B      103503254           0
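On the "where does the other 5GB go" question, mdb is the tool for both
looking and poking. A sketch, run as root; the values here are examples
only, not recommendations:

    # Observe: ARC breakdown and overall kernel memory usage
    echo ::arc | mdb -k
    echo ::memstat | mdb -k

    # arc_meta_limit can also be raised at runtime, no reboot needed
    # (/Z writes a 64-bit value; 0x180000000 = 6GB, example only)
    echo 'arc_meta_limit/Z 0x180000000' | mdb -kw

    # prefetch can likewise be toggled live (/W writes a 32-bit int)
    echo 'zfs_prefetch_disable/W 1' | mdb -kw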
> Could I do some trickery with creating a 5-6GB ramdisk and setting
> that as an L2 cache device, with secondarycache=metadata and
> primarycache=all?
> Or would it be better (due to how ZFS migrates data from the primary
> to the secondary cache) to have primarycache=metadata and
> secondarycache=all with the L2 ramdisk?
> How does ZFS currently behave if the L2 device is blank or missing at
> boot?
>
> Maybe this trickery would starve the DNLC too, though.
>
> /Tomas
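On the ramdisk idea: a cache device is added like any other vdev, so the
experiment is cheap to try. A sketch, assuming a pool named "tank" (the
ramdisk name and size are made up, and a ramdisk does not survive a
reboot):

    # Create a 5GB ramdisk and add it as an L2ARC cache device
    ramdiskadm -a arcdisk 5g
    zpool add tank cache /dev/ramdisk/arcdisk

    # Your first combination: metadata-only L2, normal primary cache
    zfs set primarycache=all tank
    zfs set secondarycache=metadata tank

A few caveats. The L2ARC is fed from buffers about to be evicted from
the ARC, so with primarycache=metadata, data blocks never enter the ARC
and therefore never reach the cache device; that makes your second
combination unlikely to do what you want. As far as I know a missing
cache device does not block import -- the pool comes up with the cache
vdev marked unavailable -- but verify that before depending on it. And
note that the L2ARC keeps its own headers in the ARC, so a RAM-backed
L2 is not free RAM.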