On 5/12/11 02:50 AM, Ryan Wehler wrote:
In an effort to solve this problem I did update my 3442E-R HBAs from a
2009 firmware to "Phase 21" which came out earlier this year from LSI. The
replacement backplane I got from my VAR when they thought that was the
issue moved the backplane firmware from 7015 to 7017 per lsiutil's output.
You're right, it must be a physical issue, but it just seems highly unlikely
that BOTH HBAs failed and BOTH SAS cables failed (we'll take the expander
out of the equation since it was replaced).
You need to look at the data available, rather than making
assumptions. When I was part of CPRE (now PTS?) in Sun we
referred to swapping hardware without investigation as
practicing "swaptronics". Every escalation we got where this
had happened took longer to resolve as a result.
So yes, it certainly could be a hardware problem twice in a
row. You'd want to examine the serial numbers and other identifying
data such as manufacturing date codes to see how likely that is.
In the past I've seen cases where replacement disks turned out to
be duds even though they came from several different batches and
different factories. The true root cause was traced to a chip that
was supplied to the manufacturer by a third party.
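
As a rough sketch of that kind of comparison, assuming you have already
collected the serial numbers and date codes by hand: the snippet below
groups parts by component type and date code, and reports any group with
more than one member. The field names (kind, serial, date_code) and the
sample values are made up for illustration; they do not come from lsiutil
or any other tool.

```python
from collections import defaultdict


def group_by_date_code(parts):
    """Group serial numbers by (component kind, manufacturing date code)."""
    groups = defaultdict(list)
    for part in parts:
        groups[(part["kind"], part["date_code"])].append(part["serial"])
    return dict(groups)


def shared_batches(parts):
    """Keep only the groups where two or more parts share a date code."""
    return {
        key: serials
        for key, serials in group_by_date_code(parts).items()
        if len(serials) > 1
    }


# Hypothetical inventory, typed in from the labels on the hardware.
parts = [
    {"kind": "HBA", "serial": "HBA-A", "date_code": "0915"},
    {"kind": "HBA", "serial": "HBA-B", "date_code": "0915"},
    {"kind": "SAS cable", "serial": "CBL-A", "date_code": "1002"},
    {"kind": "SAS cable", "serial": "CBL-B", "date_code": "1047"},
]

for (kind, date_code), serials in shared_batches(parts).items():
    print(f"{kind} date code {date_code}: {', '.join(serials)}")
```

Parts that share a date code are not proof of a common fault, but they
make "both failed the same way" a lot less surprising than it would be
for parts built months apart.
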
Personally, I'd start looking at the cables first - in my
experience they seem to incur more physical stress through the
connect/disconnect operations than HBAs.
James C. McPherson
zfs-discuss mailing list