Pawel Jakub Dawidek <pjd <at> FreeBSD.org> writes:
> 
> This is how RAIDZ fills the disks (follow the numbers):
> 
>       Disk0   Disk1   Disk2   Disk3
> 
>       D0      D1      D2      P3
>       D4      D5      D6      P7
>       D8      D9      D10     P11
>       D12     D13     D14     P15
>       D16     D17     D18     P19
>       D20     D21     D22     P23
> 
> D is data, P is parity.
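
Just to make the fill pattern explicit, here is a toy model in Python
(mine, not ZFS code) of the simplified table above: full-width stripes
on a 4-disk vdev, with the parity sector drawn in the last column of
every row:

  NDISKS = 4

  def sector_label(i):
      """Disk and role (D/P) for sector i in the simplified table above."""
      disk = i % NDISKS
      role = "P" if disk == NDISKS - 1 else "D"
      return disk, role

  # Reproduce the 6 rows of the quoted table.
  for row in range(6):
      cells = []
      for col in range(NDISKS):
          i = row * NDISKS + col
          cells.append(sector_label(i)[1] + str(i))
      print("\t".join(cells))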

This layout assumes of course that large stripes have been written to
the RAIDZ vdev. As you know, the stripe width is dynamic, so it is
possible for a single logical block to span only 2 disks (for those who
don't know what I am talking about, see the "red" block occupying LBAs
D3 and E3 on page 13 of these ZFS slides [1]).
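
To make that concrete, here is a rough model of RAIDZ1 allocation (my
own simplification, not the real allocator): each row of a stripe holds
at most ndisks - 1 data sectors plus one parity sector, so a 1-sector
block spans only 2 disks:

  import math

  def raidz1_stripe(data_sectors, ndisks):
      """Rough model: parity sectors and disks spanned by one block."""
      rows = math.ceil(data_sectors / (ndisks - 1))
      parity = rows                                # one parity sector per row
      disks = min(data_sectors, ndisks - 1) + 1    # data columns + parity
      return parity, disks

  # The "red" block from the slides: one data sector on a 4-disk RAIDZ.
  print(raidz1_stripe(1, 4))    # -> (1, 2): spans only 2 disks
  # The 18 data sectors in Pawel's table: 6 rows, all 4 disks.
  print(raidz1_stripe(18, 4))   # -> (6, 4)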

To read this logical block (and validate its checksum), only D_0 needs 
to be read (LBA E3). So in this very specific case, a RAIDZ read
operation is as cheap as a RAID5 read operation. The existence of these
small stripes could explain why RAIDZ doesn't perform as badly as RAID5
in Pawel's benchmark...
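
In other words (a back-of-the-envelope sketch, assuming RAIDZ has to
read the whole logical block to verify its checksum while RAID5 only
reads the sectors actually requested):

  def raidz_random_read_cost(block_data_sectors):
      """RAIDZ reads every data sector of the block to checksum it."""
      return block_data_sectors

  def raid5_random_read_cost(sectors_requested=1):
      """RAID5 has no block checksum; it reads only what was asked for."""
      return sectors_requested

  print(raidz_random_read_cost(1), raid5_random_read_cost())  # 1 vs 1
  print(raidz_random_read_cost(3), raid5_random_read_cost())  # 3 vs 1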

[1] http://br.sun.com/sunnews/events/2007/techdaysbrazil/pdf/eric_zfs.pdf

-marc

