Well yeah, this is obviously not a valid setup for my data, but if you read my 
first e-mail, the whole point of this test was that I had seen Solaris hang 
when a drive was removed from a fully redundant array (five sets of three-way 
mirrors), and I wanted to see what was going on.

So I started with the most basic pool I could to see how ZFS and Solaris 
actually reacted to a drive being removed.  I was fully expecting ZFS to simply 
report an error when the drive was removed, and to move the tests on to more 
complex pools.  I did not expect to find so many problems with such a simple 
setup.  And the problems I have found also lead to potential data loss in a 
redundant array, although there it would have been much more difficult to spot:

Imagine you had a raid-z array and pulled a drive as I'm doing here.  Because 
ZFS isn't aware of the removal, it keeps writing to that drive as if it were 
still valid.  That means ZFS still believes the array is online when in fact it 
should be degraded.  If any other drive now fails, ZFS will report the pool as 
degraded instead of faulted, and will continue writing data.  The problem is 
that ZFS is writing some of that data to a drive which doesn't exist, meaning 
all of that data will be lost on reboot.
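
Just to make the scenario concrete, here's a rough sketch of the sort of 
sequence I mean (the pool and device names are only placeholders, and the 
"pull" is a physical removal, so ZFS is never told about it):

    # create a raid-z pool from three drives (example device names)
    zpool create tank raidz c1t0d0 c1t1d0 c1t2d0

    # physically pull c1t2d0 -- the pool status still reports:
    zpool status tank
    #   state: ONLINE        <- should really be DEGRADED by now

    # writes keep succeeding; some blocks are "written" to the missing drive
    dd if=/dev/zero of=/tank/testfile bs=1M count=100

    # if a second drive now dies, the pool only drops to DEGRADED instead
    # of FAULTED, even though data written since the pull is no longer
    # recoverable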
 
 