2012-03-21 21:40, Marion Hakanson wrote:
Small random-read performance does not scale with the number of drives in each
raidz vdev because of the dynamic striping. To read a single logical block,
ZFS has to read all of that block's segments, which are spread across multiple
drives, so that it can validate the checksum before returning the block to the
application. This is why a single vdev's random-read performance is equivalent
to the random-read performance of a single drive.
True, but if the stars align so nicely that all the sectors
belonging to the block are read simultaneously, in parallel,
from several drives of the top-level vdev (so that no
substantial *latency* is incurred waiting between the first
and the last drive to complete the read request), then the
*aggregate bandwidth* of the array should be similar to the
bandwidth of a plain stripe.
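To illustrate the two claims side by side, here is a toy model (my own
sketch, not ZFS code, with made-up seek times): each logical block is
only done when its *slowest* segment arrives, so the per-block latency
of a wide vdev is the max over the drives, not the average. That is why
random-read IOPS stay at single-drive levels even though all the drives
are busy in parallel.

```python
import random

random.seed(1)

N_DRIVES = 6
seek_ms = lambda: random.uniform(4.0, 12.0)  # assumed random-seek time of one drive

def block_latency(n_drives):
    # One logical block is complete only when the slowest of its
    # parallel segment reads finishes.
    return max(seek_ms() for _ in range(n_drives))

TRIALS = 10_000
single = sum(seek_ms() for _ in range(TRIALS)) / TRIALS
vdev = sum(block_latency(N_DRIVES) for _ in range(TRIALS)) / TRIALS

print(f"avg single-drive seek:            {single:.1f} ms")
print(f"avg {N_DRIVES}-wide raidz block latency: {vdev:.1f} ms")
# The vdev's per-block latency is never better than one drive's,
# so random-read IOPS per vdev cannot exceed one drive's IOPS.
```

Run it and the vdev average comes out worse than the single-drive
average, which matches the point above: the stripe buys you bandwidth
per block, not more blocks per second.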
This gain would probably be hidden by caches and averages,
unless the stars align like this for many blocks in a row,
such as a sequential, uninterrupted read of a file that was
written out sequentially, so that the component drives would
stream it off the platter track by track... Ah, what a
wonderful world that would be! ;)
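For that best case, the back-of-envelope arithmetic is simple (the
figures below are assumptions for illustration, not measurements):

```python
# Assumed numbers: a 6-disk raidz1, i.e. 5 data drives plus 1 parity,
# each streaming sequentially at 120 MB/s. If every data drive streams
# its segments in parallel, throughput approaches the sum of the rates,
# just like a plain stripe.
DRIVE_MBPS = 120
DATA_DRIVES = 5

print(DATA_DRIVES * DRIVE_MBPS, "MB/s aggregate, vs", DRIVE_MBPS, "MB/s for one drive")
```

A 5x gain on paper; in practice fragmentation and seeks between files
eat into it long before the drives saturate.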
Also, after a sector is read by the disk and passed to
the OS, it is presumably cached until all the sectors of the
block have arrived and the checksum matches. During this
time the HDD is available to service other queued
mechanical tasks. I am not sure which cache that would be:
it is too early for the ARC, since there is no complete
block yet, and the vdev caches now drop non-metadata
sectors. Perhaps it is just a buffer in the instance of the
reading routine that gathers all the pieces of the block
together and passes it on to the reader (and into the ARC)...
zfs-discuss mailing list