On 10/24/09 9:43 AM, Richard Elling wrote:

> OK, here we see 4 I/Os pending outside of the host. The host has
> sent them on and is waiting for them to return. This means they are
> getting dropped either at the disk or somewhere between the disk
> and the controller.
>
> When this happens, the sd driver will time them out, try to clear
> the fault by reset, and retry. In other words, the resets you see
> are when the system tries to recover.
>
> Since there are many disks with 4 stuck I/Os, I would lean towards
> a common cause. What do these disks have in common? Firmware?
> Do they share a SAS expander?

I saw this with my Hitachi 500GB SATA disks (HDS725050KLA360) and LSI
firmware 1.28.02.00 in IT mode, but I (almost?) always had exactly 1
"stuck" I/O. Note that my disks were one per channel, with no
expanders. I have _not_ seen it since replacing those disks. So my
money is on a bug in the LSI firmware, the drive firmware, the drive
controller hardware, or some combination thereof.
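
To make the "what do these disks have in common?" question concrete,
here is a minimal Python sketch that looks for attribute values shared
by every disk with stuck I/Os and by none of the healthy ones. The
disk names, attribute keys, and values are all hypothetical
placeholders, not data from either system discussed here; you would
have to fill them in by hand from your own inventory.

```python
# Hypothetical inventory: every name and value below is a placeholder.
DISKS = [
    {"name": "disk0", "model": "MODEL-A", "disk_fw": "FW-1",
     "hba_fw": "HBA-1", "expander": None, "stuck_ios": 1},
    {"name": "disk1", "model": "MODEL-A", "disk_fw": "FW-1",
     "hba_fw": "HBA-1", "expander": None, "stuck_ios": 1},
    {"name": "disk2", "model": "MODEL-B", "disk_fw": "FW-2",
     "hba_fw": "HBA-1", "expander": None, "stuck_ios": 0},
]


def common_factors(disks, keys=("model", "disk_fw", "hba_fw", "expander")):
    """Return attributes whose value is identical across all affected
    disks and does not appear on any healthy disk."""
    affected = [d for d in disks if d["stuck_ios"] > 0]
    healthy = [d for d in disks if d["stuck_ios"] == 0]
    suspects = {}
    for key in keys:
        affected_vals = {d[key] for d in affected}
        healthy_vals = {d[key] for d in healthy}
        # One shared value among affected disks, never seen on a healthy one.
        if len(affected_vals) == 1 and not (affected_vals & healthy_vals):
            suspects[key] = next(iter(affected_vals))
    return suspects


if __name__ == "__main__":
    for key, value in common_factors(DISKS).items():
        print(f"possible common cause: {key} = {value}")
```

An attribute that is identical on every disk in the system, such as a
single HBA firmware revision, cannot be singled out this way, so it
stays a suspect until it is changed and the problem is retested.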

Note that LSI has released firmware 1.29.00.00. Sadly, I cannot find
any documentation on what has changed. It can be downloaded from LSI at
http://lsi.com/storage_home/products_home/host_bus_adapters/sas_hbas/internal/sas3081e-r/index.html?remote=1&locale=EN

--
Carson



