> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Karl Rossing
>
> So i figured out after a couple of scrubs and fmadm faulty that drive
> c9t15d0 was bad.
>
> My pool now looks like this:
>         NAME           STATE     READ WRITE CKSUM
>         vdipool        DEGRADED     0     0     2
>           raidz1       DEGRADED     0     0     4
>             c9t14d0    ONLINE       0     0     1  512 resilvered
>             spare      DEGRADED     0     0     0
>               c9t15d0  OFFLINE      0     0     0
>               c9t19d0  ONLINE       0     0     0  16.1G resilvered
>             c9t16d0    ONLINE       0     0     1  512 resilvered
>             c9t17d0    ONLINE       0     0     5  2.50K resilvered
>             c9t18d0    ONLINE       0     0     1  512 resilvered
>         spares
>           c9t19d0      INUSE     currently in use

Um... Call me crazy, but if c9t15d0 was bad, then why do all those other disks have checksum errors on them?

What you said is distinctly possible (a faulty disk behaving so badly that it causes all the components around it to exhibit failures too), but it seems unlikely. It seems much more likely that a common component (HBA, RAM, etc.) is faulty, possibly in addition to c9t15d0. Another possibility is that the faulty HBA (or whatever) caused a false positive on c9t15d0. Maybe c9t15d0 isn't any more unhealthy than the other drives on that bus, which may all be bad, or may all be good, c9t15d0 included. (It wouldn't be the first time I've seen a whole batch of disks go bad, all from the same manufacturer with closely related serial numbers and manufacture dates.)

I think you have to explain the checksum errors on all the other disks before drawing any conclusions. And the fact that it resilvered immediately after it resilvered only lends more credence to my suspicion of your bad-disk diagnosis.

BTW, what OS and what hardware are you running? How long has it been running, and how much attention do you give it? That is, can you confidently say it was running without errors for 6 months and then suddenly started exhibiting this behavior? If this is in fact a new system, or if you haven't been paying much attention, I would not be surprised to see this type of behavior if you're running on unsupported or generic hardware. And when I say "unsupported" or "generic" I mean that Intel, Asus, Dell, HP, and the other big-name brands all count as "unsupported" and "generic." Basically anything other than Sun hardware and software, fully updated and still under a support contract, if I'm exaggerating to the extreme.
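
P.S. The point about checksum errors being spread across the other disks can be checked mechanically. What follows is only a rough Python sketch, not a diagnostic tool: it assumes the status rows quoted at the top of this message, Solaris-style cXtYdZ device names, and plain integer error counts. The names `cksum_by_disk` and `disks_with_cksum_errors` are made up for illustration.

```python
import re

# Rows copied from the pool status quoted in this thread.
STATUS = """\
NAME           STATE     READ WRITE CKSUM
vdipool        DEGRADED     0     0     2
  raidz1       DEGRADED     0     0     4
    c9t14d0    ONLINE       0     0     1  512 resilvered
    spare      DEGRADED     0     0     0
      c9t15d0  OFFLINE      0     0     0
      c9t19d0  ONLINE       0     0     0  16.1G resilvered
    c9t16d0    ONLINE       0     0     1  512 resilvered
    c9t17d0    ONLINE       0     0     5  2.50K resilvered
    c9t18d0    ONLINE       0     0     1  512 resilvered
"""

# Assumption: leaf disks are named cXtYdZ and the three error columns are
# plain integers. Abbreviated counts such as "1.2K" are not handled here.
DISK_ROW = re.compile(r"^\s*(c\d+t\d+d\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)")


def cksum_by_disk(status_text):
    """Map each leaf disk name to its CKSUM error count."""
    counts = {}
    for line in status_text.splitlines():
        match = DISK_ROW.match(line)
        if match:
            counts[match.group(1)] = int(match.group(5))
    return counts


def disks_with_cksum_errors(status_text):
    """Return the sorted names of disks whose CKSUM count is nonzero."""
    return sorted(d for d, n in cksum_by_disk(status_text).items() if n > 0)


if __name__ == "__main__":
    print(disks_with_cksum_errors(STATUS))
```

On the quoted output this lists c9t14d0, c9t16d0, c9t17d0 and c9t18d0, while the disk that was blamed, c9t15d0, shows a count of zero. That is exactly the pattern that makes me look at a shared component first.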