Edward Ned Harvey writes:
>  > If you consider the extreme bias...  If the system would never give up
>  > metadata in cache until all the cached data were gone...  Then it would be
>  > similar to the current primarycache=metadata, except that the system would
>  > be willing to cache data too, whenever there was available cache otherwise
>  > going to waste.

I like this, and it could be another value for the same property:
metabias, metadata-bias, prefer-metadata, whatever.

On Fri, Jun 03, 2011 at 06:25:45AM -0700, Roch wrote:
> Interesting. Now consider this :
> 
> We have an indirect block in memory (those are 16K
> referencing 128 individual data blocks). We also have an
> unrelated data block, say 16K. Neither is currently being
> referenced, nor has either been for a long time (otherwise they
> would move up to the head of the cache lists).  They reach the
> tail of the primary cache together. I have room for one of
> them in the secondary cache. 
> 
> Absent other information, do we think that the indirect
> block is more valuable than the data block ? At first I also
> wanted to say that metadata should be favored. Now I can't come
> up with an argument to favor either one. 

The effectiveness of a cache depends on the likelihood of a hit
against a cached value, vs the cost of keeping it.

Including information that lets us predict this future likelihood
from past access patterns can improve cache effectiveness immensely.
This is what the ARC algorithm does quite well.

Absent this information, we assume the probability of future access to
all data blocks not currently in ARC is approximately equal.  The
indirect metadata block is therefore roughly 128x as likely to be
needed as the one data block, since if any of the 128 data blocks it
references is needed, so will the indirect block to find it.
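
As a rough check of that ratio, here is a minimal sketch. It assumes
each data block is needed independently with the same small
probability p; the function name, p and the independence assumption
are mine for illustration, not anything taken from the ZFS code.

```python
def indirect_vs_data_ratio(p, fanout=128):
    """How many times more likely the indirect block is to be needed
    than a single unrelated data block.

    p      -- assumed probability that any one data block is needed
    fanout -- data blocks referenced by the indirect block (128 in
              Roch's example)
    """
    # The indirect block is needed if at least one of its children is.
    p_indirect = 1.0 - (1.0 - p) ** fanout
    return p_indirect / p


if __name__ == "__main__":
    for p in (1e-6, 1e-3, 1e-2):
        print(p, indirect_vs_data_ratio(p))
```

For small p the ratio approaches the fan-out; as p grows it shrinks,
because the indirect block cannot be needed with probability above 1.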

> Therefore I think we need to include more information than just data
> vs metadata in the decision process.

If we have the information to hand, it may help - but we don't. 

The only thing I can think of that we may have is whether either
block was ever on the "frequent" list, or only on the "recent" list.
That would catch the single-pass sequential access pattern and give
it the lower priority for cache residence.
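
A quick sketch of how such a tie-break might look, purely as an
illustration: the class, the was_frequent flag and the ordering (list
history first, metadata bias second) are my assumptions, not existing
ARC or L2ARC code.

```python
from dataclasses import dataclass


@dataclass
class EvictionCandidate:
    name: str
    is_metadata: bool
    was_frequent: bool  # hypothetical: ever seen on the "frequent" list


def pick_for_secondary_cache(a, b):
    """Choose which of two blocks at the tail of the primary cache
    gets the one free slot in the secondary cache."""
    # Prefer a block that was ever "frequent" over a recent-only one;
    # if both have the same history, prefer metadata over data.
    def priority(c):
        return (c.was_frequent, c.is_metadata)

    return max((a, b), key=priority)


if __name__ == "__main__":
    indirect = EvictionCandidate("indirect", True, False)
    data = EvictionCandidate("data", False, True)
    print(pick_for_secondary_cache(indirect, data).name)
```

With this ordering a data block that was once "frequent" beats a
recent-only indirect block, and metadata wins only when the list
history is the same.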

I don't know how feasible it is to check whether any of the blocks
referenced by the indirect block are themselves in ARC, nor what that
might imply about the future likelihood of further accesses to other
blocks indirectly referenced by this one.

> Instant Poll : Yes/No ?

Yes for this as an RFE, or at least as a quick-and-dirty
implementation to measure potential benefit.

--
Dan.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss