The iostat output I posted previously was from a system on which we had already tuned the zfs:zfs_vdev_max_pending queue depth down to 10 (visible as the cap of about 10 in the actv column for each disk).
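For anyone following along, the tuning itself is just an /etc/system entry; the value shown is the one used in this test. The mdb alternative avoids the reboot - syntax is from memory, so double-check it against your build:

    * /etc/system: cap the number of I/Os ZFS keeps queued to each vdev
    set zfs:zfs_vdev_max_pending = 7

    # or patch the running kernel instead (0t7 = decimal 7), no reboot:
    echo "zfs_vdev_max_pending/W0t7" | mdb -kw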
I reset this value in /etc/system to 7, rebooted, and started a scrub. The iostat output showed busier disks (%b was higher, which seemed odd) but a cap of about 7 queued commands per disk, confirming the tuning had taken effect. iostat at a high-water mark during the test looked like this:

                     extended device statistics
    r/s    w/s     kr/s   kw/s wait  actv wsvc_t asvc_t  %w   %b device
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0 c8
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0 c8t0d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0 c8t1d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0 c8t2d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0 c8t3d0
 8344.5    0.0 359640.4    0.0  0.1 300.5    0.0   36.0   0 4362 c9
  190.0    0.0   6800.4    0.0  0.0   6.6    0.0   34.8   0   99 c9t8d0
  185.0    0.0   6917.1    0.0  0.0   6.1    0.0   32.9   0   94 c9t9d0
  187.0    0.0   6640.9    0.0  0.0   6.5    0.0   34.6   0   98 c9t10d0
  186.5    0.0   6543.4    0.0  0.0   7.0    0.0   37.5   0  100 c9t11d0
  180.5    0.0   7203.1    0.0  0.0   6.7    0.0   37.2   0  100 c9t12d0
  195.5    0.0   7352.4    0.0  0.0   7.0    0.0   35.8   0  100 c9t13d0
  188.0    0.0   6884.9    0.0  0.0   6.6    0.0   35.2   0   99 c9t14d0
  204.0    0.0   6990.1    0.0  0.0   7.0    0.0   34.3   0  100 c9t15d0
  199.0    0.0   7336.7    0.0  0.0   7.0    0.0   35.2   0  100 c9t16d0
  180.5    0.0   6837.9    0.0  0.0   7.0    0.0   38.8   0  100 c9t17d0
  198.0    0.0   7668.9    0.0  0.0   7.0    0.0   35.3   0  100 c9t18d0
  203.0    0.0   7983.2    0.0  0.0   7.0    0.0   34.5   0  100 c9t19d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0 c9t20d0
  195.5    0.0   7096.4    0.0  0.0   6.7    0.0   34.1   0   98 c9t21d0
  189.5    0.0   7757.2    0.0  0.0   6.4    0.0   33.9   0   97 c9t22d0
  195.5    0.0   7645.9    0.0  0.0   6.6    0.0   33.8   0   99 c9t23d0
  194.5    0.0   7925.9    0.0  0.0   7.0    0.0   36.0   0  100 c9t24d0
  188.5    0.0   6725.6    0.0  0.0   6.2    0.0   32.8   0   94 c9t25d0
  188.5    0.0   7199.6    0.0  0.0   6.5    0.0   34.6   0   98 c9t26d0
  196.0    0.0   6666.9    0.0  0.0   6.3    0.0   32.1   0   95 c9t27d0
  193.5    0.0   7455.4    0.0  0.0   6.2    0.0   32.0   0   95 c9t28d0
  189.0    0.0   7400.9    0.0  0.0   6.3    0.0   33.2   0   96 c9t29d0
  182.5    0.0   9397.0    0.0  0.0   7.0    0.0   38.3   0  100 c9t30d0
  192.5    0.0   9179.5    0.0  0.0   7.0    0.0   36.3   0  100 c9t31d0
  189.5    0.0   9431.8    0.0  0.0   7.0    0.0   36.9   0  100 c9t32d0
  187.5    0.0   9082.0    0.0  0.0   7.0    0.0   37.3   0  100 c9t33d0
  188.5    0.0   9368.8    0.0  0.0   7.0    0.0   37.1   0  100 c9t34d0
  180.5    0.0   9332.8    0.0  0.0   7.0    0.0   38.8   0  100 c9t35d0
  183.0    0.0   9690.3    0.0  0.0   7.0    0.0   38.2   0  100 c9t36d0
  186.0    0.0   9193.8    0.0  0.0   7.0    0.0   37.6   0  100 c9t37d0
  180.5    0.0   8233.4    0.0  0.0   7.0    0.0   38.8   0  100 c9t38d0
  175.5    0.0   9085.2    0.0  0.0   7.0    0.0   39.9   0  100 c9t39d0
  177.0    0.0   9340.0    0.0  0.0   7.0    0.0   39.5   0  100 c9t40d0
  175.5    0.0   8831.0    0.0  0.0   7.0    0.0   39.9   0  100 c9t41d0
  190.5    0.0   9177.8    0.0  0.0   7.0    0.0   36.7   0  100 c9t42d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0 c9t43d0
  196.0    0.0   9180.5    0.0  0.0   7.0    0.0   35.7   0  100 c9t44d0
  193.5    0.0   9496.8    0.0  0.0   7.0    0.0   36.2   0  100 c9t45d0
  187.0    0.0   8699.5    0.0  0.0   7.0    0.0   37.4   0  100 c9t46d0
  198.5    0.0   9277.0    0.0  0.0   7.0    0.0   35.2   0  100 c9t47d0
  185.5    0.0   9778.3    0.0  0.0   7.0    0.0   37.7   0  100 c9t48d0
  192.0    0.0   8384.2    0.0  0.0   7.0    0.0   36.4   0  100 c9t49d0
  198.5    0.0   8864.7    0.0  0.0   7.0    0.0   35.2   0  100 c9t50d0
  192.0    0.0   9369.8    0.0  0.0   7.0    0.0   36.4   0  100 c9t51d0
  182.5    0.0   8825.7    0.0  0.0   7.0    0.0   38.3   0  100 c9t52d0
  202.0    0.0   7387.9    0.0  0.0   7.0    0.0   34.6   0  100 c9t55d0

...and sure enough, about 20 minutes into it, I got this (bus reset?):

scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@34,0 (sd49): incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@21,0 (sd30): incomplete read- retrying
scsi: [ID 107833 kern.warning] WARNING: /p...@0,0/pci8086,6...@4/pci1000,3...@0/s...@1e,0 (sd27): incomplete read- retrying
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0): Rev. 8 LSI, Inc. 1068E found.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0): mpt0 supports power management.
scsi: [ID 365881 kern.info] /p...@0,0/pci8086,6...@4/pci1000,3...@0 (mpt0): mpt0: IOC Operational.

During the "bus reset", iostat output looked like this:

                     extended device statistics                      ---- errors ---
    r/s    w/s     kr/s   kw/s wait  actv wsvc_t asvc_t  %w   %b s/w h/w trn tot device
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t0d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t1d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t2d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c8t3d0
    0.0    0.0      0.0    0.0  0.0  88.0    0.0    0.0   0 2200   0   3   0   3 c9
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t8d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t9d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t10d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t11d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t12d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t13d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t14d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t15d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t16d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t17d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t18d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t19d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t20d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t21d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t22d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t23d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t24d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t25d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t26d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t27d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t28d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t29d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   1   0   1 c9t30d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t31d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t32d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   1   0   1 c9t33d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t34d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t35d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t36d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t37d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t38d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t39d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t40d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t41d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t42d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t43d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t44d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t45d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t46d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t47d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t48d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t49d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t50d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   0   0   0 c9t51d0
    0.0    0.0      0.0    0.0  0.0   4.0    0.0    0.0   0  100   0   1   0   1 c9t52d0
    0.0    0.0      0.0    0.0  0.0   0.0    0.0    0.0   0    0   0   0   0   0 c9t55d0

During our previous testing we had tried setting this max_pending value all the way down to 1, but we still hit the problem (it just took a little longer to appear), and I couldn't find anything else I could set to throttle I/O to the disks, hence the frustration.
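In case anyone wants to reproduce this, the live value can be sanity-checked before kicking off the scrub, and the error counters watched alongside the mpt messages. The mdb syntax below is from memory, and the iostat flags are my guess at what produces output like the above, so treat it as a sketch:

    # print the current queue-depth setting from the running kernel (decimal)
    echo "zfs_vdev_max_pending/D" | mdb -k

    # extended stats with controller aggregation and error columns, every 5s
    iostat -xnCe 5

    # watch for mpt resets/retries as they arrive
    tail -f /var/adm/messages | grep -i mpt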
If you hadn't seen this output, would you say that 7 was a "reasonable" value for that max_pending queue on our architecture, one that should give the LSI controller enough breathing room to operate? If so, I *should* be able to scrub the disks successfully (ZFS isn't to blame), and the finger points instead at the mpt driver, the LSI firmware, or the disk firmware.
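For completeness, the test cycle itself is nothing exotic; the pool name below is hypothetical:

    # start the scrub, then poll progress and per-device error counts
    zpool scrub tank
    zpool status -v tank

    # after a hang/reset, look for transport error telemetry
    fmdump -eV | tail -100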