Thanks for the replies; some more questions follow.
Your answers below seem to contradict each other somewhat.
Is it true that:
1) VDEV cache before b70 used to contain a full copy
of prefetched disk contents,
2) VDEV cache since b70 analyzes the prefetched sectors
and only keeps metadata blocks,
3) VDEV cache since b148 is disabled by default?
So in fact currently we only have file-level "intelligent" prefetch?
On my older systems I fired "kstat -p zfs:0:vdev_cache_stats"
and saw hit/miss ratios ranging from 30% to 70%. On the oi_148a
box I do indeed see all-zeros.
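For anyone who wants to compare boxes, here is a rough sketch of how a hit ratio could be derived from that kstat output. It assumes the usual `kstat -p` layout of `module:instance:name:statistic<TAB>value` and counters named `hits` and `misses`; the sample numbers are made up for illustration, not taken from a real system, and "hit ratio" here means hits / (hits + misses), which is my assumption about how to read the percentages.

```python
def parse_kstat(text):
    """Collect integer counters from `kstat -p` style lines, keyed by statistic name."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        name = key.rsplit(":", 1)[-1]  # last field of module:instance:name:statistic
        try:
            stats[name] = int(value)
        except ValueError:
            continue  # non-integer fields such as "class misc" are ignored
    return stats


def hit_ratio(stats):
    """Return hits / (hits + misses), or 0.0 when the cache saw no traffic."""
    hits = stats.get("hits", 0)
    misses = stats.get("misses", 0)
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical sample output, invented for this example.
SAMPLE = (
    "zfs:0:vdev_cache_stats:class\tmisc\n"
    "zfs:0:vdev_cache_stats:delegations\t1200\n"
    "zfs:0:vdev_cache_stats:hits\t3000\n"
    "zfs:0:vdev_cache_stats:misses\t7000\n"
)

if __name__ == "__main__":
    print("hit ratio: {:.0%}".format(hit_ratio(parse_kstat(SAMPLE))))
```

On a box where the cache is disabled, all counters are zero and the function simply returns 0.0 instead of dividing by zero.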
While I do understand the implications of VDEV-caching lots
of disks on systems with inadequate RAM, I tend to find this
feature useful on smaller systems - like home-NASes. It is
essentially free in terms of mechanical seeks, as well as
in RAM (what is 60-100 MB for a small box at home?) and any
nonzero hit ratio that speeds up the system seems justifiable ;)
I've tried playing with the options on my oi_148a LiveUSB
repair boot, and got varying results:
The VDEV cache is indeed disabled by default, but can be enabled.
My system is scrubbing now, so it's got a few cache hits
(about 10%) right away.
root@openindiana:~# echo zfs_vdev_cache_size/W0t10000000 | mdb -kw
zfs_vdev_cache_size: 0 = 0x989680
root@openindiana:~# kstat -p zfs:0:vdev_cache_stats
However, trying to increase the prefetch size hung my system
almost immediately (in a couple of seconds). I'm away from
it now, so I'll ask for a photo of the console screen :)
root@openindiana:~# echo zfs_vdev_cache_max/W0t16384 | mdb -kw
zfs_vdev_cache_max: 0x4000 = 0x4000
root@openindiana:~# echo zfs_vdev_cache_bshift/W0t20 | mdb -kw
zfs_vdev_cache_bshift: 0x10 = 0x14
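To make the numbers in these mdb writes easier to follow, here is a small arithmetic sketch. It assumes that each cache entry is 2^zfs_vdev_cache_bshift bytes and that the number of entries per vdev is roughly zfs_vdev_cache_size shifted right by the same amount; that is my reading of the tunables, not something confirmed in this thread, and the function name is made up.

```python
def cache_geometry(cache_size, bshift):
    """Return (bytes per cache entry, number of entries that fit in cache_size).

    Assumption: an entry is 2**bshift bytes and the entry count is
    cache_size >> bshift (integer division, remainder is unused).
    """
    entry_bytes = 1 << bshift
    entries = cache_size >> bshift
    return entry_bytes, entries


if __name__ == "__main__":
    # The decimal values written with W0t..., shown the way mdb echoes them back.
    for decimal in (10000000, 16384, 20):
        print(decimal, "=", hex(decimal))

    # Old bshift (0x10 = 16) versus the new one (0x14 = 20), same 10000000-byte cache.
    for bshift in (16, 20):
        entry_bytes, entries = cache_geometry(10000000, bshift)
        print("bshift", bshift, "->", entry_bytes, "bytes per entry,", entries, "entries")
```

Under these assumptions, going from bshift 16 to 20 turns 64 KB reads into 1 MB reads and shrinks a 10000000-byte cache from 152 entries to 9 per vdev.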
So there are deeper questions:
1) As of Illumos bug #175 (as well as OpenSolaris b148 and
if known - Solaris 11), is the vdev prefetch feature
*removed* from codebase ("no" as of oi_148a, what about
others?), or disabled by default (i.e. limit is preset
to 0, tune it yourself)?
2) If it is only disabled, are there solid plans to remove
it, or can we vote to keep it for those interested? :)
3) If the feature is present and gets enabled, how would
VDEV prefetch play along with file prefetch, again? ;)
4) Is there some tuneable (after b70) to enable prefetching
and keeping of user-data as well (not only metadata)?
Perhaps only so that I could test it with my use-patterns
to make sure that caching generic sectors is useless for
me, and I really should revert to caching only metadata?
5) Would it make sense to increase zfs_vdev_cache_bshift?
For example, when I tried to set it to 20 and prefetch
a whole 1MB of data, why would that cause the system
to die? Can it increase cache hit ratios (if it works)?
6) Does the VDEV cache keep ZFS blocks or disk sectors?
For example, on my 4k disks the blocks are 4k, even
though there are a few hundred bytes worth of data in
metadata blocks and 3+KB of slack space.
7) Modern HDDs often have 32-64 MB DRAM cache onboard.
Is there any reason to match VDEV cache size with that
in any way (1:1, 2:1, etc)?
2012-01-09 6:06, Richard Elling wrote:
On Jan 8, 2012, at 5:10 PM, Jim Klimov wrote:
2012-01-09 4:14, Richard Elling wrote:
On Jan 7, 2012, at 8:59 AM, Jim Klimov wrote:
I wonder if it is possible (currently or in the future as an RFE)
to tell ZFS to automatically read-ahead some files and cache them
in RAM and/or L2ARC?
See discussions on the ZFS intelligent prefetch algorithm. I think Ben's
description is the best general description:
And a more engineer-focused description is at:
Thanks for the pointers. While I've seen those articles
(in fact, one of the two non-spam comments in Ben's
blog was mine), rehashing the basics is always useful ;)
Still, how does VDEV prefetch play along with file-level prefetch?
Trick question... it doesn't. vdev prefetching is disabled in OpenSolaris b148
and Solaris 11 releases. The benefits of having the vdev cache for large
disks do not appear to justify the cost. See
For example, if ZFS prefetched 64K from disk
at the SPA level, and those sectors luckily happen to
contain "next" blocks of a streaming-read file, would
the file-level prefetch take the data from RAM cache
or still request them from the disk?
As of b70, vdev_cache only contains metadata. See
In what cases would it make sense to increase the
zfs_vdev_cache_size? Does it apply to all disks
combined, or to each disk (or even slice/partition)?
It applies to each leaf vdev.
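Since the limit is per leaf vdev, the total RAM footprint scales with the number of disks. A trivial sketch of that arithmetic, using the 10000000-byte value from the mdb example and disk counts picked only for illustration (the function name is made up):

```python
def total_vdev_cache_bytes(per_vdev_bytes, leaf_vdevs):
    """Upper bound on RAM used by the vdev cache when the limit applies per leaf vdev."""
    return per_vdev_bytes * leaf_vdevs


if __name__ == "__main__":
    per_vdev = 10000000  # zfs_vdev_cache_size as set in the example above
    for disks in (1, 6, 10, 48):  # illustrative pool sizes, not measurements
        total = total_vdev_cache_bytes(per_vdev, disks)
        print(disks, "leaf vdevs ->", total // 1000000, "MB")
```

So a home box with 6 to 10 disks would top out around 60-100 MB, while a 48-disk server would need close to half a gigabyte for the same setting.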
zfs-discuss mailing list