Hi guys,

There seems to have been some discussion about this before
(http://mail.opensolaris.org/pipermail/zfs-discuss/2006-September/013050.html)
but I don't *quite* understand why the roundup is necessary.

Using Bill's notation, if there isn't a roundup (writing 4k fs blocks
to a 4-device RAID-Z), wouldn't you get something like this:

      Disk   0   1   2   3
      --------------------
LBA   0      A.  A   A   A
      1      A.  A   A   A
      2      A.  A   A   B.
      3      B   B   B   B.
      4      B   B   B   B.
      5      B   B   C.  C
      etc.

Here, writing the 4k B block starts at offset 5.5k and finishes at 11k,
i.e. for B, which starts at sector 11 and holds 8 data sectors of 512
bytes (as seen in vdev_raidz_map_alloc):
f  = 11 % 4 == 3 (column 3, 0-indexed)
q  = 8 / (4-1) == 2
r  = 8 - (2 * (4-1)) == 2
bc = 2 + 1 == 3
Then, in the for loop, the first 3 columns (starting from column 3 and
wrapping) will be marked as big columns (storing an extra sector each),
and the final column, column 2, will only store 2 sectors. The
map_alloc function can then set the asize to be 5.5k, and a subsequent
write of C can start at offset 11k. Can anyone explain what it is I'm
missing?
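
For anyone who wants to check the numbers, here is a minimal Python
sketch of the arithmetic above. It is not the ZFS code: the function
names raidz_columns and asize_sectors are made up for illustration,
512-byte sectors are assumed, and the roundup branch assumes that the
roundup in question pads the allocation to a multiple of (nparity + 1)
sectors.

```python
SECTOR = 512  # bytes per sector (assumption: 512-byte sectors)


def raidz_columns(offset_sectors, data_sectors, dcols=4, nparity=1):
    """Return [(disk, sectors)] for one write, using the f/q/r/bc steps above."""
    f = offset_sectors % dcols                 # first column of the write
    q = data_sectors // (dcols - nparity)      # full rows of data
    r = data_sectors - q * (dcols - nparity)   # leftover data sectors
    bc = 0 if r == 0 else r + nparity          # number of "big" columns
    acols = bc if q == 0 else dcols            # columns actually used
    # Columns start at f and wrap around; the first bc get one extra sector.
    return [((f + c) % dcols, q + (1 if c < bc else 0)) for c in range(acols)]


def asize_sectors(data_sectors, dcols=4, nparity=1, roundup=False):
    """Allocated size in sectors: data plus nparity parity sectors per row."""
    rows = -(-data_sectors // (dcols - nparity))   # ceiling division
    total = data_sectors + nparity * rows
    if roundup:
        # Assumed form of the roundup: pad to a multiple of (nparity + 1).
        total = -(-total // (nparity + 1)) * (nparity + 1)
    return total


# Block B: starts at sector 11 (5.5k) and carries 8 data sectors (4k).
print(raidz_columns(offset_sectors=11, data_sectors=8))
print(asize_sectors(8) * SECTOR / 1024, "k without roundup")
print(asize_sectors(8, roundup=True) * SECTOR / 1024, "k with roundup")
```

Under those assumptions, B lands on columns 3, 0 and 1 with 3 sectors
each and on column 2 with 2 sectors, which matches the diagram, and the
only difference the roundup makes is 11 versus 12 sectors allocated.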

Many thanks,

James
