On Fri, May 12, 2006 at 05:23:53PM +0200, Roch Bourbonnais - Performance 
Engineering wrote:
> For read it is an interesting concept. Since
> 
>       Reading into cache
>       Then copy into user space
>       then keep data around but never use it
> 
> is not optimal. 
> So 2 issues, there is the cost of copy and there is the memory.
> 
> Now could we detect the pattern that makes holding on to the
> cached block suboptimal, and do a quick freebehind after the
> copyout? Something like random access + very large file + poor
> cache-hit ratio?

An interface to request no caching on a per-file basis would be good
(madvise(2) should do for mmap'ed files, an fcntl(2) or open(2) flag
would be better).
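For comparison, the closest existing interface on Linux-style systems is
posix_fadvise(2) with POSIX_FADV_DONTNEED, which drops cached pages for a
file range after the copyout. A minimal sketch of the "read once, then
freebehind" pattern (the fcntl(2)/open(2) flag proposed above does not
exist and would be the cleaner version of this):

```c
#define _XOPEN_SOURCE 600   /* for posix_fadvise */
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

/* Read up to 'len' bytes once, then hint the kernel that the cached
 * pages will not be reused -- an explicit, per-file freebehind. */
int read_once_uncached(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, len);
    /* Advice only: the kernel may drop the pages, but correctness
     * does not depend on it. */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    close(fd);
    return (int) n;
}
```

An open(2) flag would be better still because it applies the policy before
the first read, instead of cleaning up after each one.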

> Now about avoiding the copy; That would mean dma straight
> into user space ? But if the checksum does not validate the
> data, what do we do ?

Who cares?  You DMA into user space, check the checksum, and if there's a
problem return an error; yes, there's [corrupted] data in the user-space
buffer... but the app knows it, so what's the problem (see below)?
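The flow is simple enough to sketch. Everything below is hypothetical
illustration, not an actual ZFS interface; the checksum is a simple
Fletcher-style sum (ZFS itself uses fletcher2/fletcher4 variants), and a
memcpy stands in for the device DMA:

```c
#include <stdint.h>
#include <stddef.h>
#include <errno.h>
#include <string.h>

/* Fletcher-style checksum over 32-bit words. */
static uint64_t cksum(const void *buf, size_t nwords)
{
    const uint32_t *w = buf;
    uint64_t a = 0, b = 0;
    for (size_t i = 0; i < nwords; i++) {
        a += w[i];
        b += a;
    }
    return (b << 32) | (a & 0xffffffff);
}

/* Hypothetical zero-copy read: the block lands directly in the
 * caller's buffer (memcpy stands in for the DMA), and the checksum
 * is verified in place.  On mismatch, return EIO; user_buf is left
 * holding the corrupted data, but the caller knows the read failed. */
int read_nocopy(void *user_buf, const void *disk_block, size_t len,
                uint64_t expected)
{
    memcpy(user_buf, disk_block, len);   /* "DMA" into user space */
    if (cksum(user_buf, len / 4) != expected)
        return EIO;
    return 0;
}
```

The point being: the buffer's contents after a failed read are already
unspecified territory, so "corrupt data + error return" is a perfectly
coherent contract.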

>                       If storage is not raid-protected and we
> have to return EIO, I don't think we can do this _and_
> corrupt the user buffer also, not sure what POSIX says for
> this situation.

If POSIX compliance is an issue just add new interfaces (possibly as
simple as an open(2) flag).

> Now latency wise, the cost of copy is small compared to the
> I/O; right? So it now turns into an issue of saving some
> CPU cycles.

Can you build a system where the cost of the copy adds significantly to
the latency numbers?  (Think RAM disks.)

Nico
-- 
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss