> But it seems that when we're talking about full block writes (such as
> sequential file writes) ZFS could do a bit better.
>
> And as long as there is bandwidth left to the disk and the controllers,
> it is difficult to argue that the work is redundant.  If it's free in
> that sense, it doesn't matter whether it is redundant.  But if it turns
> out NOT to have been redundant, you save a lot.
> 

I think this is why an adaptive algorithm makes sense ... in situations where
an application issues frequent, progressive small writes, the amount of
redundant disk access can be significant, and longer consolidation times
may make sense ... larger writes (>= the FS block size) would benefit less
from longer consolidation times, and shorter thresholds could provide more
usable bandwidth
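
to make that concrete, here's a rough sketch of the kind of heuristic I
have in mind (purely illustrative C, not actual ZFS code; the constants,
names, and the EMA weight are all my own assumptions):

/*
 * illustrative sketch only -- not actual ZFS code.  the idea: track
 * the recent average write size and stretch the consolidation window
 * for small sub-block writes (more coalescing, less redundant disk
 * access) while shrinking it for full-block writes (flush sooner).
 */
#include <stddef.h>

#define	FS_BLOCK_SIZE	(128 * 1024)	/* recordsize of the pool */
#define	WINDOW_MIN_MS	1000		/* shortest consolidation window */
#define	WINDOW_MAX_MS	10000		/* longest consolidation window */

static size_t avg_write_size;		/* exponential moving average */

/* called on every incoming application write */
void
note_write(size_t bytes)
{
	/* EMA with 1/8 weight on the new sample */
	avg_write_size += ((long)bytes - (long)avg_write_size) / 8;
}

/* called when scheduling the next consolidation (txg sync) */
int
consolidation_window_ms(void)
{
	if (avg_write_size >= FS_BLOCK_SIZE)
		return (WINDOW_MIN_MS);	/* full-block writes: flush sooner */

	/* scale the window up as the average write shrinks below a block */
	return (WINDOW_MIN_MS + (int)((WINDOW_MAX_MS - WINDOW_MIN_MS) *
	    (1.0 - (double)avg_write_size / FS_BLOCK_SIZE)));
}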

to get a sense of the issue here, I've run some write tests against
previously written files in a ZFS file system, and the choice of write
element size shows some big swings in actual vs data-driven bandwidth

when I launch a set of threads, each of which writes 4KB buffers
sequentially to its own file, I observe that for 60GB of application
writes, the disks see 230+GB of I/O (reads and writes):
    data-driven BW =~ 41 MB/sec (my 60GB in ~1500 sec)
    actual BW =~ 157 MB/sec (the 230+GB in ~1500 sec)

if I do the same writes with 128KB buffers (the block size of my pool),
the same 60GB of writes generates only 95GB of disk I/O (reads and writes):
    data-driven BW =~ 85 MB/sec (my 60GB in ~700 sec)
    actual BW =~ 134.6 MB/sec (the 95+GB in ~700 sec)
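
put another way, the 4KB case shows roughly 230/60 =~ 3.8x amplification
between application writes and total disk traffic, while the 128KB case
shows only about 95/60 =~ 1.6x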

in the first case, longer consolidation times would have led to less total
I/O and better data-driven BW, while in the second case shorter
consolidation times would have worked better
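
for anyone who wants to get a feel for this themselves, here's a minimal
sketch along the lines of my test (illustrative only: the thread count,
file sizes, paths, and the use of pwrite() are assumptions of this sketch,
not a description of any particular tool):

/*
 * each thread sequentially rewrites its own previously written file
 * in BUF_SIZE chunks.
 */
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define	NTHREADS	8
#define	BUF_SIZE	(4 * 1024)	/* 4KB; use 128KB for the 2nd case */
#define	FILE_SIZE	(8LL * 1024 * 1024 * 1024)	/* bytes per thread */

static void *
writer(void *arg)
{
	char path[64];
	long long off;
	char *buf;
	int fd;

	(void) snprintf(path, sizeof (path), "/pool/fs/testfile.%ld",
	    (long)arg);
	if ((fd = open(path, O_WRONLY)) < 0) {	/* file written earlier */
		perror("open");
		return (NULL);
	}
	if ((buf = malloc(BUF_SIZE)) == NULL) {
		(void) close(fd);
		return (NULL);
	}
	(void) memset(buf, 'x', BUF_SIZE);

	for (off = 0; off < FILE_SIZE; off += BUF_SIZE) {
		if (pwrite(fd, buf, BUF_SIZE, off) != BUF_SIZE) {
			perror("pwrite");
			break;
		}
	}
	free(buf);
	(void) close(fd);
	return (NULL);
}

int
main(void)
{
	pthread_t tid[NTHREADS];
	long i;

	for (i = 0; i < NTHREADS; i++)
		(void) pthread_create(&tid[i], NULL, writer, (void *)i);
	for (i = 0; i < NTHREADS; i++)
		(void) pthread_join(tid[i], NULL);
	return (0);
}

build with something like "cc -o seqwrite seqwrite.c -lpthread" and watch
the disks with iostat while it runs ... comparing the bytes written at the
application level against what iostat reports is where the amplification
numbers above come from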

as far as redundant writes possibly occupying free bandwidth (and thus
costing nothing), I think you also have to consider the related costs:
additional block scavenging, and less free space available at any given
instant, which can limit the sequentiality of the next write ... of
course there's also the additional device stress

in any case, I agree with you that ZFS could do a better job in this area,
but it's not as simple as just looking for large or small I/Os ...
sequential vs random access patterns also play a big role (as you point out)

I expect (and hope) the adaptive algorithms will mature over time, eventually
providing better behavior over a broader set of operating conditions
... Bill
 
 