> On Wed, Nov 07, 2007 at 01:47:04PM -0800, can you guess? wrote:
> > I do consider the RAID-Z design to be somewhat brain-damaged [...]
> 
> How so? In my opinion, it seems like a cure for the
> brain damage of RAID-5.

Nope.

A decent RAID-5 hardware implementation has no 'write hole' to worry about, and 
one can make a software implementation similarly robust with some effort (e.g., 
by using a transaction log to protect the data-plus-parity double-update or by 
using COW mechanisms like ZFS's in a more intelligent manner).
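
To make the double-update problem concrete, here is a minimal sketch of the transaction-log approach. It is a toy in-memory model, not any real controller's or ZFS's code; the Stripe class, its log field and the crash_after_data flag are all hypothetical names made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class Stripe:
    """Toy stripe: N data blocks plus one parity block, all held in memory."""

    def __init__(self, data):
        self.data = list(data)
        self.parity = xor_blocks(self.data)
        # Hypothetical intent log entry: (index, new_block, new_parity) or None.
        self.log = None

    def consistent(self):
        return xor_blocks(self.data) == self.parity

    def write(self, index, new_block, crash_after_data=False):
        # Read-modify-write: new parity = old parity ^ old data ^ new data.
        new_parity = xor_blocks([self.parity, self.data[index], new_block])
        self.log = (index, new_block, new_parity)  # 1. record intent first
        self.data[index] = new_block               # 2. update the data block
        if crash_after_data:
            return                                 # simulated power loss
        self.parity = new_parity                   # 3. update the parity block
        self.log = None                            # 4. retire the log entry

    def recover(self):
        """Replay an outstanding log entry; replaying twice is harmless."""
        if self.log is not None:
            index, new_block, new_parity = self.log
            self.data[index] = new_block
            self.parity = new_parity
            self.log = None


stripe = Stripe([b"\x01\x02", b"\x10\x20", b"\xff\x00"])
stripe.write(1, b"\xaa\xbb", crash_after_data=True)
hole_open = not stripe.consistent()   # data and parity now disagree
stripe.recover()
hole_closed = stripe.consistent()     # log replay restores agreement
```

Without step 1 the interrupted write would leave data and parity silently out of sync; with it, recovery only has to replay whatever entry is still outstanding.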

The part of RAID-Z that's brain-damaged is its performance for concurrent 
small-to-medium-sized accesses (at least up to request sizes equal to the 
largest block size that ZFS supports, and arguably somewhat beyond that):  
while conventional RAID-5 across N+1 disks can satisfy N+1 small-to-medium 
read accesses or (N+1)/2 small-to-medium write accesses in parallel (though 
the latter also take an extra disk revolution to complete), RAID-Z can 
satisfy only one small-to-medium access request at a time (well, plus a 
smidge for read accesses if it doesn't verify the parity) - effectively 
providing RAID-3-style performance.
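
The arithmetic behind those figures can be written down as a back-of-the-envelope model. This is only a sketch of the counting argument: the function names are made up, the floor division for odd disk counts is my assumption, and it ignores caching, queueing and the read "smidge" mentioned above.

```python
def raid5_parallel_small_reads(n_data_disks):
    # A small read touches a single disk, so each of the N+1 spindles
    # can be serving a different request at the same time.
    return n_data_disks + 1


def raid5_parallel_small_writes(n_data_disks):
    # A small write occupies two disks (the data disk and the parity disk
    # for that stripe), so at best half the spindles' worth run in parallel.
    # Rounding down for odd disk counts is an assumption of this sketch.
    return (n_data_disks + 1) // 2


def raidz_parallel_small_accesses(n_data_disks):
    # RAID-Z spreads every block across the disks of the group, so each
    # access ties up the whole group regardless of how many disks it has.
    return 1


# Example: a 4+1 array.
n = 4
reads = raid5_parallel_small_reads(n)
writes = raid5_parallel_small_writes(n)
raidz = raidz_parallel_small_accesses(n)
```

The point of the model is that the RAID-5 numbers grow with the disk count while the RAID-Z number does not.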

The easiest way to fix ZFS's deficiency in this area would probably be to map 
each group of N blocks in a file as a stripe with its own parity - which would 
have the added benefit of removing any need to handle parity groups at the disk 
level (this would, incidentally, not be a bad idea to use for mirroring as 
well, if my impression is correct that there's a remnant of LVM-style internal 
management there).  While this wouldn't allow use of parity RAID for very small 
files, in most installations they really don't occupy much space compared to 
that used by large files so this should not constitute a significant drawback.
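
A rough sketch of what I mean by per-file stripes, assuming fixed-size blocks and simple XOR parity. The function names are hypothetical, and leaving the leftover tail blocks unprotected here is purely a simplification of the sketch (in practice they would need some other protection, such as mirroring).

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def file_to_stripes(file_blocks, n):
    """Group a file's blocks into stripes of n, each with its own parity.

    Returns (stripes, tail): stripes is a list of (group, parity) pairs and
    tail holds any leftover blocks (fewer than n), which this sketch leaves
    without parity.
    """
    full = len(file_blocks) - len(file_blocks) % n
    stripes = []
    for i in range(0, full, n):
        group = file_blocks[i:i + n]
        stripes.append((group, xor_blocks(group)))
    return stripes, file_blocks[full:]


def rebuild(group, parity, lost_index):
    """Reconstruct one lost block from the stripe's survivors and parity."""
    survivors = [b for i, b in enumerate(group) if i != lost_index]
    return xor_blocks(survivors + [parity])


# Seven 4-byte blocks with n = 3 give two full stripes and one tail block.
blocks = [bytes([i]) * 4 for i in range(7)]
stripes, tail = file_to_stripes(blocks, 3)
group, parity = stripes[0]
recovered = rebuild(group, parity, 1)   # pretend the disk holding block 1 died
```

Because each block stays whole on a single disk, a small read touches only that disk, and the other members of the stripe are needed only for reconstruction or for parity updates on write.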

- bill
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss