>>> My question is, is there any way to anticipate this "choking" situation
>>> when a disk is failing, to avoid the general failure?
>> No.
> Yes.


> But not necessarily using the setup that you are currently using - that is
> not quite clear from your original email.
> If you have 4 HBAs, you want to arrange your raid such that you could
> survive the complete loss of an entire HBA.  This would mean you build your
> pool out of a bunch of 4-disk raidz vdevs, or perhaps a bunch of 8-disk
> raidz2 vdevs.
> The whole problem you're facing is that some bad disk brings down the whole
> bus with it...  Make your redundancy able to survive the loss of a bus.
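
For illustration, the layout described above could be sketched roughly as
follows. This is only a sketch under assumptions: the device names
(c1t0d0 and so on), the pool name "tank", and the helper names plan_vdevs
and zpool_create_command are made up for the example, and it assumes every
HBA carries the same number of disks. Each vdev takes as many disks from
every HBA as its raidz level has parity, so losing one whole HBA costs any
vdev no more disks than it can tolerate.

```python
# Parity disks per raidz level: how many member disks a vdev can lose.
PARITY = {"raidz": 1, "raidz2": 2, "raidz3": 3}


def plan_vdevs(disks_by_hba, level="raidz"):
    """Group disks into vdevs that survive the loss of one entire HBA.

    disks_by_hba maps a (hypothetical) HBA label to its list of disks.
    Each vdev takes PARITY[level] disks from every HBA.
    """
    per_hba = PARITY[level]
    counts = {len(disks) for disks in disks_by_hba.values()}
    if len(counts) != 1:
        raise ValueError("sketch assumes the same disk count on every HBA")
    n = counts.pop()
    if n % per_hba:
        raise ValueError("disks per HBA must be a multiple of the parity")

    vdevs = []
    for start in range(0, n, per_hba):
        members = []
        for hba in sorted(disks_by_hba):
            members.extend(disks_by_hba[hba][start:start + per_hba])
        vdevs.append(members)
    return vdevs


def zpool_create_command(pool, vdevs, level="raidz"):
    """Build the 'zpool create' command line for the planned vdevs."""
    parts = ["zpool", "create", pool]
    for members in vdevs:
        parts.append(level)
        parts.extend(members)
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical inventory: 4 HBAs (c1..c4) with 4 disks each.
    disks = {f"c{h}": [f"c{h}t{t}d0" for t in range(4)] for h in range(1, 5)}
    print(zpool_create_command("tank", plan_vdevs(disks, "raidz"), "raidz"))
    print(zpool_create_command("tank", plan_vdevs(disks, "raidz2"), "raidz2"))
```

With 4 HBAs this yields 4-disk raidz vdevs (one disk per HBA) or 8-disk
raidz2 vdevs (two disks per HBA), matching the two layouts suggested above.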

We've had luck eliminating expanders from the design, too :-)

But this is also one of those cases where the failure leaves you with a
"wounded soldier" -- not dead, but not able to keep fighting effectively.
The result is a massive slowdown of the system that is best described as
a DoS.
 -- richard

