Hi guys,

There seems to have been some discussion about this before (http://mail.opensolaris.org/pipermail/zfs-discuss/2006-September/013050.html) but I don't *quite* understand why the roundup is necessary.
Using Bill's notation, if there isn't a roundup (writing 4k fs blocks to a 4 device RAID-Z), wouldn't you get something like this:

    Disk    0   1   2   3
    --------------------
    LBA 0   A.  A   A   A
        1   A.  A   A   A
        2   A.  A   A   B.
        3   B   B   B   B.
        4   B   B   B   B.
        5   B   B   C.  C
    etc.

Where writing the 4k of B blocks starts at offset 5.5k and finishes at 9k.

i.e. for B (as seen in vdev_raidz_map_alloc):

    f  = 11 % 4           == column 3 (0 indexed)
    q  = 8 / (3)          == 2
    r  = 8 - (2 * (4-1))  == 2
    bc = 2 + 1            == 3

Then in the for loop, the first 3 columns (starting from column 3 and wrapping) will be marked as big columns (storing an extra block each) and the final column, column 2, will only store 2 blocks...

The map_alloc function can then set the asize to be 5.5k, and a subsequent write of C can start at offset 11.

Can anyone explain what it is I'm missing?

Many thanks,
James
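
P.S. In case it helps anyone check my arithmetic, here is a rough Python sketch of the calculation above. It is only my reading of the column math, not the real vdev_raidz_map_alloc code: the function name map_alloc_sketch, the 512-byte sector size, and the idea of summing the per-column counts to get an unrounded asize are all my own assumptions for illustration.

```python
# Rough sketch of the column arithmetic only; NOT the real vdev_raidz_map_alloc.
# Assumption: 512-byte sectors, so a 4k block is 8 data sectors.
SECTOR = 512


def map_alloc_sketch(offset_sectors, size_sectors, dcols=4, nparity=1):
    """Return (f, q, r, bc, cols, asize) for one write, with no roundup.

    offset_sectors: starting offset of the write, in sectors
    size_sectors:   data size of the write, in sectors
    cols:           list of (column index, sectors stored), starting at f
    asize:          total sectors used (data + parity), hypothetical unrounded
    """
    f = offset_sectors % dcols                  # first column of the write
    q = size_sectors // (dcols - nparity)       # full rows of data
    r = size_sectors - q * (dcols - nparity)    # leftover data sectors
    bc = 0 if r == 0 else r + nparity           # number of "big" columns
    # Walk the columns starting at f and wrapping; big columns hold q + 1.
    cols = [((f + c) % dcols, q + 1 if c < bc else q) for c in range(dcols)]
    asize = sum(count for _, count in cols)
    return f, q, r, bc, cols, asize


if __name__ == "__main__":
    # Block B: starts at sector 11 (5.5k), 4k of data = 8 sectors.
    f, q, r, bc, cols, asize = map_alloc_sketch(11, 8)
    print(f, q, r, bc)            # 3 2 2 3
    print(cols)                   # [(3, 3), (0, 3), (1, 3), (2, 2)]
    print(asize * SECTOR / 1024)  # 5.5
```

For B this gives f = 3, q = 2, r = 2, bc = 3, with columns 3, 0 and 1 holding 3 sectors each and column 2 holding 2, i.e. 11 sectors (5.5k) in total, which is what I was expecting without the roundup.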