On 2013-01-19 23:39, Richard Elling wrote:
> This is not quite true for raidz. If there is a 4k write to a raidz
> comprised of 4k sector disks, then there will be one data and one
> parity block. There will not be 4 data + 1 parity with 75% space
> wastage. Rather, the space allocation more closely resembles a
> variant of mirroring, like some vendors call "RAID-1E"

I agree with this reply, but as I posted late last year when reporting
on my "digging in the bowels of ZFS" and my problematic pool, for a
6-disk raidz2 set I only saw allocation sizes (including the two parity
sectors) divisible by 3 sectors, even if the amount of (compressed)
userdata was not so rounded. That is, miniature files or tails of files
that fit into one sector took that sector plus two parities (a 3-sector
allocation overall), while tails of 2-4 sectors occupied 6 sectors with
parity (even though 2 or 3 data sectors would need just 4 or 5 sectors
with parities, respectively).

I am not sure what the 3 means: it could be "one userdata sector plus
both parities" or "half of a 6-disk stripe"; both explanations fit my
data.
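
To make the arithmetic concrete, here is a small Python sketch of one
allocation rule that reproduces the numbers above. It is only a model
under stated assumptions, not code taken from ZFS: the function name
raidz_alloc_sectors is made up for illustration, and the rule "round
every allocation up to a multiple of (nparity + 1) sectors" is an
assumption corresponding to the first of the two explanations. It may or
may not be what the allocator really does.

```python
def raidz_alloc_sectors(data_sectors, ndisks=6, nparity=2):
    """Return (allocated, padding) in sectors for one block.

    Hypothetical model: each row of up to (ndisks - nparity) data
    sectors carries nparity parity sectors, and the total is then
    rounded up to a multiple of (nparity + 1). The rounding rule is
    an assumption chosen to match the observed allocation sizes.
    """
    data_cols = ndisks - nparity
    rows = -(-data_sectors // data_cols)       # ceiling division
    needed = data_sectors + nparity * rows     # data + parity, no padding
    unit = nparity + 1                         # assumed rounding unit
    allocated = -(-needed // unit) * unit      # round up to a multiple of unit
    return allocated, allocated - needed


if __name__ == "__main__":
    # 6-disk raidz2, as in the pool described above
    for data in range(1, 9):
        allocated, padding = raidz_alloc_sectors(data)
        print(f"{data} data -> {allocated} allocated, {padding} padding")
```

Under this model, 2 data sectors need 4 sectors but take 6 (2 wasted),
and 3 data sectors need 5 but take 6 (1 wasted), which is the pattern
seen on the pool. With ndisks=5 and nparity=1 a single 4k sector comes
out as one data plus one parity sector, as in the quoted example.
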
But yes, with current raidz allocation there are many ways to waste
space. And those small percentages (or not so small) do add up.
Rectifying this, i.e. allocating only as many sectors as are actually
used, does not seem to require an incompatible on-disk format change,
and should be doable within the write-queue logic. It might cost some
efficiency; however, ZFS already explicitly "rotates" the starting
disks of allocations every few megabytes in order to even out the load
among spindles (normally parity doesn't have to be read, unless
mismatches occur on the data disks). Disabling such padding would only
help achieve this goal and save space at the same time...