Peter Taps wrote:
Hi Eric,
Thank you for your help. At least one part is clear now.
I still am confused about how the system is still functional after one disk
fails.
Consider my earlier example of 3 disks zpool configured for raidz-1. To keep it
simple let's not consider block sizes.
Let's say I send a write value "abcdef" to the zpool.
As the data gets striped, we will have 2 characters per disk.
disk1 = "ab" + some parity info
disk2 = "cd" + some parity info
disk3 = "ef" + some parity info
Now, if disk2 fails, I lost "cd." How will I ever recover this? The parity info
may tell me that something is bad but I don't see how my data will get recovered.
The only good thing is that any newer data will now be striped over two disks.
Perhaps I am missing some fundamental concept about raidz.
Regards,
Peter
It's done via math and numbers. :) In a computer, everything is
numbers, stored in base 2 (binary)...there are no letters or other
symbols. Your sample value of 'abcdef' will be represented as a
sequence of numbers, probably using the ASCII equivalent numbers, which
are in turn represented as a binary sequence.
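
To make that concrete, here is a small sketch in Python of what the sample
value looks like as numbers. It assumes ASCII encoding with one byte per
character, which is my assumption and not something the filesystem cares
about; ZFS only ever sees the bytes.

```python
# Assumption: the text is ASCII-encoded, one byte per character.
text = "abcdef"

# Each character becomes a small integer...
codes = list(text.encode("ascii"))

# ...and each integer is stored as a pattern of 8 bits.
bits = [format(code, "08b") for code in codes]

for char, code, pattern in zip(text, codes, bits):
    print(char, code, pattern)
```

The first two numbers that come out of this, 97 and 98, are the codes for
'a' and 'b'.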
A simplified view of how you can protect multiple independent pieces of
information with one piece of parity is as follows.
(Note: this simplified view is not exactly how RAID5 or RAIDZ work, as
they actually make use of XOR at a bitwise level).
Consider an equation with variables (unrelated to your sample value) A,
B, and P, where A + B = P. P is the parity value.
A and B are numbers representing your data; they were indirectly chosen
by you when you created your data. P is the generated parity value.
If A=97, and B=98, then P=97+98=195.
Each of the three variables is stored on a different disk. If any one
variable is lost (the disk failed), the missing variable can be
recalculated by rearranging the formula and using the known values.
Assuming 'A' was lost, then A=P-B
P-B=195-98
195-98=97
A=97. Data recovered.
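
The same arithmetic as a minimal Python sketch. The function names
make_parity and recover are made up for illustration, and this is the
simplified addition scheme from above, not what RAID5 or RAIDZ really do.

```python
def make_parity(a, b):
    """Simplified parity: just the sum of the two data values."""
    return a + b


def recover(parity, survivor):
    """Rebuild the lost data value from the parity and the surviving value."""
    return parity - survivor


A, B = 97, 98
P = make_parity(A, B)        # stored on the third disk

# The disk holding A fails: rebuild it from P and B.
recovered_A = recover(P, B)

# The same rearrangement works if the disk holding B fails instead.
recovered_B = recover(P, A)

print(P, recovered_A, recovered_B)
```

If the disk holding P is the one that fails, nothing is lost; P is simply
recomputed from A and B.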
In this simplified example, one piece of parity data P is generated for
every pair of A and B values that is written. When only one value needs
to be written, the missing one is padded with zero. For more
than 3 disks, the formula can expand to variations of A+B+C+D+E+F=P
where P is the parity. Additional levels of parity require using more
complex techniques to generate the needed parity values.
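
As a rough sketch of the wider case, here is single parity over six data
values, this time using XOR as mentioned in the note near the top instead of
addition. The helper names xor_parity and rebuild are hypothetical, and the
sketch ignores block sizes and the real on-disk layout of RAIDZ entirely; it
only shows that one parity value is enough to rebuild any one lost value.

```python
from functools import reduce
from operator import xor


def xor_parity(values):
    """XOR all data values together to get one parity value."""
    return reduce(xor, values, 0)


def rebuild(survivors, parity):
    """XOR the parity with every surviving value to get the lost one back."""
    return reduce(xor, survivors, parity)


# Six data values, imagined as living on six different disks.
data = [97, 98, 99, 100, 101, 102]
P = xor_parity(data)          # stored on a seventh disk

# Pretend the disk holding data[2] has failed.
lost_index = 2
survivors = data[:lost_index] + data[lost_index + 1:]

recovered = rebuild(survivors, P)
print(recovered == data[lost_index])
```

XOR is convenient here because it is its own inverse: the same operation
both generates the parity and rebuilds a lost value, and the result never
grows larger than the inputs the way a sum can.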
There are lots of other explanations online that might help you out as
well: http://www.google.com/#hl=en&q=how+raid+works
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss