Pawel Jakub Dawidek <pjd <at> FreeBSD.org> writes:
> This is how RAIDZ fills the disks (follow the numbers):
>
> Disk0  Disk1  Disk2  Disk3
>
> D0     D1     D2     P3
> D4     D5     D6     P7
> D8     D9     D10    P11
> D12    D13    D14    P15
> D16    D17    D18    P19
> D20    D21    D22    P23
>
> D is data, P is parity.
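(Aside: a toy Python loop, purely illustrative and not ZFS code, that
reproduces the fill pattern quoted above; the 4-disk width and the fixed
parity column are just what the diagram shows:)

    # Reproduce the quoted fill pattern: cells are numbered sequentially
    # left to right, top to bottom; the last column of each row is parity.
    disks = 4
    for row in range(6):
        cells = [("P" if col == disks - 1 else "D") + str(row * disks + col)
                 for col in range(disks)]
        print("  ".join(cells))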
This layout assumes, of course, that large stripes have been written to
the RAIDZ vdev. As you know, the stripe width is dynamic, so it is
possible for a single logical block to span only 2 disks (for those who
don't know what I am talking about, see the "red" block occupying LBAs
D3 and E3 on page 13 of these ZFS slides [1]). To read this logical
block (and validate its checksum), only D_0 needs to be read (LBA E3).
So in this very specific case, a RAIDZ read operation is as cheap as a
RAID5 read operation. The existence of these small stripes could explain
why RAIDZ doesn't perform as badly as RAID5 in Pawel's benchmark...
(a rough model of this is sketched in the P.S. below)

[1] http://br.sun.com/sunnews/events/2007/techdaysbrazil/pdf/eric_zfs.pdf

-marc
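P.S. A back-of-the-envelope model of the point above (my assumptions,
not ZFS internals): a logical block of n data sectors on an ndisks-wide
RAIDZ fans out over min(n, ndisks - 1) data columns, and since the
checksum covers the data, parity need not be read on the happy path:

    # Rough model, not ZFS code: disks touched by a checksum-verified
    # RAIDZ read of a block spanning data_sectors sectors.
    def raidz_disks_read(data_sectors, ndisks):
        # The block's data spreads over at most ndisks - 1 columns;
        # parity is only read when reconstruction is needed.
        return min(data_sectors, ndisks - 1)

    for n in (1, 2, 3, 6):
        print("%d-sector block -> %d disk(s) read" % (n, raidz_disks_read(n, 4)))

A 1-sector block reads a single disk, i.e. the RAID5 cost; only wide
blocks fan out across the whole vdev.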