So, while we are working on resolving this issue with Sun, let me approach this 
from another perspective: what controller-to-drive ratio would be the minimum 
recommended to support a functional OpenSolaris-based archival solution? Given 
the following:

- the vast majority of IO to the system is going to be "read" oriented, other 
than the initial "load" of the archive shares and possibly scrubs/resilvering 
in the case of failed drives
- we currently have one LSISAS3801E with two external ports; each port connects 
to one 23-disk JBOD
- Each JBOD can take in two external SAS connections if we enable the 
"split-backplane" option on it, which would split the disk IO path between the 
two connectors (12 disks on one connector, 11 on the other); we do not 
currently have this enabled
- our current server platform only has 1 x PCIe-x8 slot available; we *could* 
look at changing this in the future, but I'd prefer to find a one-card solution 
if possible

Here is the math I did to show the current IO situation (PLEASE correct this 
if I am mistaken, as I am somewhat "winging" it here and my head hurts):

Based on info from:

http://storageadvisors.adaptec.com/2006/07/26/sas-drive-performance/
http://en.wikipedia.org/wiki/PCI_Express
http://support.wdc.com/product/kb.asp?modelno=WD1002FBYS&x=9&y=8

WD1002FBYS 1TB SATA2 7200rpm drive specs
Avg seek time = 8.9ms
Avg latency = 4.2ms
Max transfer speed = 112 MB/s
Avg transfer speed ~= 65 MB/s

"Random" IO scenario (theoretical numbers):
8.9ms avg seek time + 4.2ms avg latency = 13.1 ms avg access time
1/0.0131 = 76 IOPS/drive
22 (23 - 1 spare) drives x 76 IOPS/drive = 1672 IOPS/shelf
1672 IOPS/shelf x 2 = 3344 IOPS/controller
-or-
22 (23 - 1 spare) drives x 65 MB/s/drive = 1430 MB/s/shelf
1430 MB/s/shelf x 2 = 2860 MB/s/controller
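
In case it helps anyone sanity-check the arithmetic, here is the random-IO 
math as a small Python snippet (just my own back-of-envelope script, using the 
WD1002FBYS figures quoted above and assuming 22 data drives, i.e. one hot 
spare, per shelf):

# Random-IO estimate: every IO pays an average seek plus rotational latency.
avg_seek_s = 0.0089        # 8.9 ms average seek (WD1002FBYS spec)
avg_latency_s = 0.0042     # 4.2 ms average rotational latency
avg_xfer_mb_s = 65         # average sustained transfer rate, MB/s

drives_per_shelf = 23 - 1  # one hot spare per shelf
shelves_per_controller = 2

access_time_s = avg_seek_s + avg_latency_s      # 0.0131 s
iops_per_drive = round(1 / access_time_s)       # ~76 IOPS (rounded as above)

print(drives_per_shelf * iops_per_drive)                            # 1672 IOPS/shelf
print(drives_per_shelf * iops_per_drive * shelves_per_controller)   # 3344 IOPS/controller
print(drives_per_shelf * avg_xfer_mb_s)                             # 1430 MB/s/shelf
print(drives_per_shelf * avg_xfer_mb_s * shelves_per_controller)    # 2860 MB/s/controller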

Pure "streamed read" IO scenario  (theoretical numbers):
0.0 avg seek time + 4.2ms avg latency = 4.2 ms avg access time
1/0.0042 = 238 IOPS/drive
22 (23 - 1 spare) drives x 238 IOPS/drive = 5236 IOPS/shelf
5236 IOPS/shelf x 2 = 10472 IOPS/controller
-or-
22 (23 - 1 spare) drives x 112 MB/s/drive = 2464 MB/s/shelf
2464 MB/s/shelf x 2 = 4928 MB/s/controller
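
And the same thing for the streamed-read case (again just a sketch; the only 
changes are zero seek time and the drive's maximum transfer rate):

# Streamed-read estimate: no seeks, only rotational latency, and the drive's
# maximum (outer-track) transfer rate.
avg_latency_s = 0.0042     # 4.2 ms
max_xfer_mb_s = 112        # MB/s

drives_per_shelf = 23 - 1
shelves_per_controller = 2

iops_per_drive = round(1 / avg_latency_s)       # ~238 IOPS

print(drives_per_shelf * iops_per_drive)                            # 5236 IOPS/shelf
print(drives_per_shelf * iops_per_drive * shelves_per_controller)   # 10472 IOPS/controller
print(drives_per_shelf * max_xfer_mb_s)                             # 2464 MB/s/shelf
print(drives_per_shelf * max_xfer_mb_s * shelves_per_controller)    # 4928 MB/s/controller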

Max. bandwidth of a single SAS PHY interface = 270 MB/s (300 MB/s raw, minus
~10% overhead)

The LSISAS3801E has 2 external x4 SAS connectors (4 PHYs each). Each shelf
currently gets one x4 connector, so:

Max controller bandwidth/shelf = 4 x 270 MB/s = 1080 MB/s
Max controller bandwidth = 2 x 1080 MB/s = 2160 MB/s

Max. bandwidth of a PCIe x8 (PCIe 1.x) interface = 2 GB/s per direction
Typical sustained bandwidth of a PCIe x8 interface (max - 5% overhead) =
1.9 GB/s
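
For completeness, the link and bus ceilings in the same form (the ~10% SAS 
overhead and ~5% PCIe overhead are rule-of-thumb guesses on my part, not 
measured numbers):

# SAS side: each 3 Gb/s PHY is ~300 MB/s raw; assume ~10% overhead.
sas_phy_mb_s = 300 * 0.90            # ~270 MB/s usable per PHY
phys_per_wide_port = 4               # one x4 connector per shelf today

per_shelf_link_mb_s = phys_per_wide_port * sas_phy_mb_s     # 1080 MB/s
controller_link_mb_s = 2 * per_shelf_link_mb_s              # 2160 MB/s

# PCIe side: x8 at PCIe 1.x is ~2 GB/s per direction; assume ~5% overhead.
pcie_x8_mb_s = 2000
pcie_x8_sustained_mb_s = pcie_x8_mb_s * 0.95                # ~1900 MB/s

print(per_shelf_link_mb_s, controller_link_mb_s, pcie_x8_sustained_mb_s)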

Summary:

The current controller cannot handle the max IO load of even the random IO
scenario (1430 MB/s per shelf needed, but the controller can only deliver
1080 MB/s per shelf). Also, the PCIe bus can't push more than ~1.9 GB/s
sustained through a single slot, so we are limited by the single card either
way.

Solution:

Connecting both x4 SAS connectors to one shelf (i.e. enabling the
split-backplane option) would get us 2160 MB/s per shelf. That would remove
the controller as a bottleneck for all but the extreme streamed-read scenario,
but the PCIe bus would still throttle us to ~1.9 GB/s per slot. So the
controller could keep up with the shelves, but the PCIe bus would have to wait
sometimes, which may (?) be a "healthier" situation than overwhelming the
controller.
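
To make the comparison concrete, here's a rough way to see where the 
bottleneck lands in each configuration (the shelf_cap helper and the 
assumption that the PCIe budget splits evenly across active shelves are mine, 
purely for illustration):

# Effective per-shelf throughput = min(drive demand, SAS link, PCIe share).
PCIE_SUSTAINED_MB_S = 1900     # whole-card budget from the estimate above

def shelf_cap(demand_mb_s, sas_link_mb_s, shelves_on_card):
    """Per-shelf throughput cap once SAS-link and PCIe limits are applied."""
    pcie_share = PCIE_SUSTAINED_MB_S / shelves_on_card  # assume an even split
    return min(demand_mb_s, sas_link_mb_s, pcie_share)

RANDOM_DEMAND = 1430      # MB/s per shelf, random-IO estimate
STREAMED_DEMAND = 2464    # MB/s per shelf, streamed-read estimate

# Today: one x4 connector (1080 MB/s) per shelf, two shelves per card.
print(shelf_cap(RANDOM_DEMAND, 1080, 2))      # 950.0  -> PCIe share is the limit
# Split backplane: both connectors (2160 MB/s) to one shelf, one shelf per card.
print(shelf_cap(RANDOM_DEMAND, 2160, 1))      # 1430   -> drives are the limit
print(shelf_cap(STREAMED_DEMAND, 2160, 1))    # 1900.0 -> PCIe is the limit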

To support two shelves per controller (with split backplanes), we could use an 
LSISAS31601E (4 x 4-port SAS connectors), but we would hit the PCIe bus 
limitation again. Moving to two (or more?) separate PCIe x8 cards would be 
best, but that would require us to alter our server platform.
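
For what it's worth, the same kind of check for the four-connector card (again 
just my arithmetic, assuming it also sits in a single PCIe x8 slot):

# LSISAS31601E option: four x4 connectors on one card, two shelves in
# split-backplane mode, still limited by the single PCIe x8 slot.
connectors = 4
link_mb_s = connectors * 4 * 270           # 4320 MB/s of SAS link bandwidth
pcie_sustained_mb_s = 1900                 # whole-card PCIe budget

per_shelf_mb_s = min(link_mb_s, pcie_sustained_mb_s) / 2   # two shelves share it
print(per_shelf_mb_s)                      # 950.0 -> right back to the PCIe wall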

Whew. Thoughts? Comments? Suggestions?