Peter Taps wrote:
Hi Eric,

Thank you for your help. At least one part is clear now.

I am still confused about how the system remains functional after one disk 
fails.

Consider my earlier example of a 3-disk zpool configured for raidz-1. To keep it 
simple, let's not consider block sizes.

Let's say I send a write value "abcdef" to the zpool.

As the data gets striped, we will have 2 characters per disk.

disk1 = "ab" + some parity info
disk2 = "cd" + some parity info
disk3 = "ef" + some parity info

Now, if disk2 fails, I lose "cd". How will I ever recover it? The parity info 
may tell me that something is bad, but I don't see how my data would be recovered.

The only good thing is that any newer data will now be striped over two disks.

Perhaps I am missing some fundamental concept about raidz.

Regards,
Peter

It's done via math and numbers. :) In a computer, everything is a number stored in base 2 (binary); there are no letters or other symbols. Your sample value of 'abcdef' will be represented as a sequence of numbers, probably the ASCII codes of its characters, which are in turn stored as binary sequences.
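To make that concrete, here is a small Python sketch (Python is just my choice for illustration, and it assumes plain ASCII encoding) that prints the number and the bit pattern behind each character of 'abcdef':

```python
# Show how the text "abcdef" looks as numbers and as bits.
# Assumption: plain ASCII encoding, one byte per character.
text = "abcdef"
codes = [ord(ch) for ch in text]                # ASCII code of each character
bits = [format(code, "08b") for code in codes]  # 8-bit binary string per code

for ch, code, pattern in zip(text, codes, bits):
    print(ch, code, pattern)
```

The first line printed is `a 97 01100001`; the disks only ever see bit patterns like that last column.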

A simplified view of how you can protect multiple independent pieces of information with one piece of parity is as follows. (Note: this simplified view is not exactly how RAID5 or RAIDZ work, as they actually make use of XOR at a bitwise level).

Consider an equation with three variables A, B, and P (unrelated to your sample value), where A + B = P. A and B are numbers representing your data; you chose them indirectly when you created that data. P is the parity value generated from them.

If A=97, and B=98, then P=97+98=195.

Each of the three variables is stored on a different disk. If any one variable is lost (the disk failed), the missing variable can be recalculated by rearranging the formula and using the known values.

Assuming 'A' was lost, then A=P-B
P-B=195-98
195-98=97
A=97.  Data recovered.
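
The same arithmetic can be written as a minimal Python sketch. The function names `make_parity` and `recover_missing` are made up for this illustration, and this is the simplified additive scheme from above, not what RAIDZ really does:

```python
def make_parity(a, b):
    """Simplified parity: just the sum of the two data values."""
    return a + b

def recover_missing(parity, known):
    """Rebuild the lost value from the parity and the surviving value."""
    return parity - known

A, B = 97, 98
P = make_parity(A, B)               # stored on the third disk

# Pretend the disk holding A has failed; only B and P are readable.
recovered_A = recover_missing(P, B)
print(P, recovered_A)               # 195 97
```

The same `recover_missing` call also rebuilds B if that disk is the one that fails: pass in P and A instead.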

In this simplified example, one piece of parity data P is generated for every pair of A and B values that are written. When only one value needs to be written, a special case handles it (zero padding). For more than 3 disks, the formula expands to variations of A+B+C+D+E+F=P, where P is still the parity. Additional levels of parity require more complex techniques to generate the extra parity values.
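
Since real single-parity schemes use bitwise XOR rather than addition, here is a rough Python sketch of the XOR version with more than two data values. The helper `xor_blocks` and the one-chunk-per-disk layout are assumptions made for illustration only; the actual RAIDZ on-disk layout is more involved.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

# Hypothetical layout: three data chunks plus one parity chunk, one per disk.
data = [b"ab", b"cd", b"ef"]
parity = xor_blocks(data)

# Simulate losing the disk that held b"cd": XOR everything that survived.
survivors = [data[0], data[2], parity]
recovered = xor_blocks(survivors)
print(recovered)  # b'cd'
```

XOR is its own inverse, so the same function both generates the parity and rebuilds a lost chunk. That is why one extra disk is enough to survive any single failure.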

There are lots of other explanations online that might help you out as well: http://www.google.com/#hl=en&q=how+raid+works

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
