Nicolas Williams writes:
> On Fri, May 12, 2006 at 05:23:53PM +0200, Roch Bourbonnais - Performance
> Engineering wrote:
> > For read it is an interesting concept. Since
> >
> > Reading into cache
> > Then copy into user space
> > then keep data around but never use it
> >
> > is not optimal.
> > So there are 2 issues: the cost of the copy and the memory.
> >
> > Now, could we detect the patterns that make holding on to the
> > cached block suboptimal and do a quick freebehind after the
> > copyout? Something like random access + very large file + poor
> > cache hit ratio?
>
> An interface to request no caching on a per-file basis would be good
> (madvise(2) should do for mmap'ed files, an fcntl(2) or open(2) flag
> would be better).
>
> > Now, about avoiding the copy: that would mean DMA straight
> > into user space? But if the checksum does not validate the
> > data, what do we do?
>
> Who cares? You DMA into user space, check the checksum, and if there's
> a problem return an error; so there's [corrupted] data in the user
> space buffer... but the app knows it, so what's the problem (see
> below)?
>
> > If storage is not RAID-protected and we have to return EIO, I
> > don't think we can do this _and_ corrupt the user buffer also;
> > not sure what POSIX says for this situation.
>
> If POSIX compliance is an issue, just add new interfaces (possibly as
> simple as an open(2) flag).
>
> > Now, latency-wise, the cost of the copy is small compared to the
> > I/O, right? So it now turns into an issue of saving some CPU
> > cycles.
>
> Can you build a system where the cost of the copy adds significantly to
> the latency numbers? (Think RAM disks.)
>
> Nico
> --
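To make the per-file no-caching hint Nico asks for above concrete, here is a minimal sketch (my own illustration, not ZFS code) of the two hints applications can already express today: posix_fadvise(2) with POSIX_FADV_NOREUSE / POSIX_FADV_DONTNEED for read(2)-style I/O (defined by POSIX, though a filesystem is free to ignore it and it may not be present on every Solaris release), and madvise(2) with MADV_DONTNEED for mmap'ed files. The file name and buffer size are arbitrary.

/*
 * Sketch only: hint that a file's data will not be reused, so caching
 * it is wasted effort.  Whether ZFS honors either hint is exactly the
 * open question in this thread.
 */
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
	if (argc != 2) {
		(void) fprintf(stderr, "usage: %s file\n", argv[0]);
		return (1);
	}

	int fd = open(argv[1], O_RDONLY);
	if (fd == -1) {
		perror("open");
		return (1);
	}

	/* Hint up front: we will read this file once and not touch it again. */
	(void) posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

	char buf[64 * 1024];
	ssize_t n;
	while ((n = read(fd, buf, sizeof (buf))) > 0)
		;	/* consume the data exactly once */

	/* Or, after the fact, ask that already-cached pages be dropped. */
	(void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

	/* For an mmap'ed file the equivalent hint is madvise(2). */
	struct stat st;
	if (fstat(fd, &st) == 0 && st.st_size > 0) {
		void *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
		    MAP_PRIVATE, fd, 0);
		if (p != MAP_FAILED) {
			/* Tell the VM system these pages need not be kept. */
			(void) madvise((caddr_t)p, (size_t)st.st_size,
			    MADV_DONTNEED);
			(void) munmap(p, (size_t)st.st_size);
		}
	}

	(void) close(fd);
	return (0);
}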
Finally I can agree with somebody today.

Directio is non-POSIX anyway, and given that people have been trained to inform the system when caching won't be useful, and that it's a hard problem to detect automatically, let's avoid the copy and save the memory all at once for the read path. We could use the directio() call for that ...

-r
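For reference, this is roughly what the existing Solaris directio(3C) advisory call looks like from an application. It is only a sketch of the calling convention: directio() is Solaris-specific advice, UFS honors it, and whether ZFS would honor it (and skip the copy/cache on the read path) is precisely what is being proposed here; a filesystem that ignores the advice fails the call with an error such as ENOTTY.

/*
 * Sketch only: ask the filesystem to bypass the page cache for reads
 * on this file descriptor using the directio(3C) advisory interface.
 */
#include <sys/types.h>
#include <sys/fcntl.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
	if (argc != 2) {
		(void) fprintf(stderr, "usage: %s file\n", argv[0]);
		return (1);
	}

	int fd = open(argv[1], O_RDONLY);
	if (fd == -1) {
		perror("open");
		return (1);
	}

	/* Advise the filesystem that we want uncached (direct) reads. */
	if (directio(fd, DIRECTIO_ON) == -1)
		perror("directio");	/* e.g. ENOTTY: advice not supported */

	char buf[128 * 1024];
	ssize_t n;
	while ((n = read(fd, buf, sizeof (buf))) > 0)
		;	/* data lands straight in the user buffer when honored */

	(void) close(fd);
	return (0);
}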