Hi, you could also try the LSI itmpt driver; it seems to handle this better, 
although I think it only supports about 8 devices at once.

You could also try a more recent build of OpenSolaris (123 or even 126), as 
there seem to be a lot of fixes for the mpt driver (which still seems to have 
issues).

Yours
Markus Kovero

-----Original Message-----
From: zfs-discuss-boun...@opensolaris.org 
[mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of M P
Sent: 11 November 2009 18:08
To: zfs-discuss@opensolaris.org
Subject: [zfs-discuss] ZFS on JBOD storage, mpt driver issue - server not 
responding

The server uses a Sun StorageTek 8-port external SAS PCIe HBA (mpt driver) 
connected to an external JBOD array with 12 disks.

Here is a link to the exact SAS (Sun) adapter: 
http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf  (LSI SAS3801)

When running IO-intensive operations (zpool scrub) for a couple of hours, the 
server locks up with the following repeating messages:

Nov 10 16:31:45 sunserver scsi: [ID 365881 kern.info] 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:31:45 sunserver       Log info 0x31140000 received for target 17.
Nov 10 16:31:45 sunserver       scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Nov 10 16:32:55 sunserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:32:55 sunserver       Disconnected command timeout for Target 19
Nov 10 16:32:56 sunserver scsi: [ID 365881 kern.info] 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:32:56 sunserver       Log info 0x31140000 received for target 19.
Nov 10 16:32:56 sunserver       scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Nov 10 16:34:16 sunserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:34:16 sunserver       Disconnected command timeout for Target 21
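
To see whether the timeouts hit a few targets or all of them, the warnings 
could be tallied per target. This is only a minimal sketch, assuming the log 
lines look exactly like the excerpt above; the function name 
count_timeouts_by_target is hypothetical and not part of any Solaris tool.

```python
import re
from collections import Counter

# Matches the mpt timeout warning text exactly as it appears in the excerpt.
TIMEOUT_RE = re.compile(r"Disconnected command timeout for Target (\d+)")


def count_timeouts_by_target(lines):
    """Return a Counter mapping target number to its timeout warning count."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

It could be fed the system log, for example 
count_timeouts_by_target(open("/var/adm/messages")), assuming the messages 
are logged to that default location on the server.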

I tested this on two servers:
- Sun Fire X2200 using a Sun Storage J4200 JBOD array and
- Dell R410 Server with a Promise VTJ-310SS JBOD array

Both show the same repeating messages and lock up after a couple of hours of 
zpool scrub.

Solaris appears to be more stable than OpenSolaris: it doesn't lock up when 
scrubbing, but it still locks up after 5-6 hours of reading from the JBOD 
array (10 TB in size).

So at this point this looks like an issue with the mpt driver or these SAS 
cards (I tested two) under heavy load. I installed the latest firmware for the 
SAS card from LSI's web site (v1.29.00), but nothing changed: the server still 
locks up.

Any ideas or suggestions on how to fix or work around this issue? The adapter 
is supposed to be enterprise-class.

Here is more detailed log info:
========================================================
Sun Fire X2200 and Sun Storage J4200 JBOD array

SAS card: Sun StorageTek 8-port external SAS PCIe HBA

http://www.sun.com/storage/storage_networking/hba/sas/PCIe.pdf  (LSI SAS3801)

Operating System: SunOS sunserver 5.11 snv_111b i86pc i386 i86pc Solaris

Nov 10 16:30:33 sunserver scsi: [ID 365881 kern.info] 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:30:33 sunserver       Log info 0x31140000 received for target 0.
Nov 10 16:30:33 sunserver       scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Nov 10 16:31:43 sunserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:31:43 sunserver       Disconnected command timeout for Target 17
Nov 10 16:32:55 sunserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:32:55 sunserver       Disconnected command timeout for Target 19
Nov 10 16:32:56 sunserver scsi: [ID 365881 kern.info] 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:32:56 sunserver       Log info 0x31140000 received for target 19.
Nov 10 16:32:56 sunserver       scsi_status=0x0, ioc_status=0x8048, 
scsi_state=0xc
Nov 10 16:34:16 sunserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci10de,3...@a/pci1000,3...@0 (mpt0):
Nov 10 16:34:16 sunserver       Disconnected command timeout for Target 21

----------------
Dell R410 Server and Promise VTJ-310SS JBOD array

SAS card: Sun StorageTek 8-port external SAS PCIe HBA

Operating System: SunOS dellserver 5.10 Generic_141445-09 i86pc i386 i86pc

Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,3...@3/pci1028,1...@0 (mpt0):
Nov 11 00:18:22 dellserver         Disconnected command timeout for Target 0
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,3...@3/pci1028,1...@0/s...@0,0 (sd13):
Nov 11 00:18:22 dellserver         Error for Command: read(10)                
Error Level: Retryable
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice]   Requested Block: 
276886498                 Error Block: 276886498
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice]   Vendor: Dell         
                      Serial Number: Dell Interna
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice]   Sense Key: Unit 
Attention
Nov 11 00:18:22 dellserver scsi: [ID 107833 kern.notice]   ASC: 0x29 (power on, 
reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0
Nov 11 00:19:33 dellserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,3...@3/pci1028,1...@0 (mpt0):
Nov 11 00:19:33 dellserver         Disconnected command timeout for Target 0
Nov 11 00:19:34 dellserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,3...@3/pci1028,1...@0/s...@0,0 (sd13):
Nov 11 00:19:34 dellserver         SCSI transport failed: reason 'reset': 
retrying command
Nov 11 00:20:44 dellserver scsi: [ID 107833 kern.warning] WARNING: 
/p...@0,0/pci8086,3...@3/pci1028,1...@0 (mpt0):
Nov 11 00:20:44 dellserver         Disconnected command timeout for Target 0
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss