I recently discovered what looks like a drive failure (either that or a
loose cable; I need to investigate further) on my home fileserver.
'fmadm faulty' returns no output, but I can clearly see the failure
when I run 'zpool status -v':

pool: tank
state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
scan: scrub canceled on Tue Feb  1 11:51:58 2011
config:

        NAME         STATE     READ WRITE CKSUM
        tank         DEGRADED     0     0     0
          raidz2-0   DEGRADED     0     0     0
            c10t0d0  ONLINE       0     0     0
            c10t1d0  ONLINE       0     0     0
            c10t2d0  ONLINE       0     0     0
            c10t3d0  REMOVED      0     0     0
            c10t4d0  ONLINE       0     0     0
            c10t5d0  ONLINE       0     0     0
            c10t6d0  ONLINE       0     0     0
            c10t7d0  ONLINE       0     0     0

In dmesg, I see:
Feb  1 11:14:33 megatron scsi: [ID 107833 kern.warning] WARNING:
/pci@0,0/pci8086,2e21@1/pci15d9,a580@0/sd@3,0 (sd8):
Feb  1 11:14:33 megatron        Command failed to complete...Device is gone
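
I haven't dug into the FMA logs beyond 'fmadm faulty' yet; my next step
is to check whether any ereports were even generated for this, along
the lines of:

  # recent entries in the error-report log (raw ereport telemetry,
  # separate from the diagnosed faults that 'fmadm faulty' lists)
  fmdump -e | tail -20

  # full detail, to see if anything mentions mpt or sd8
  fmdump -eV | less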

I never had any problems with these drives + mpt on snv_134 (I'm on
snv_151a now); the only change was adding a second 1068E-IT, which
currently has no drives attached. But more importantly, why can't I
see this failure in fmadm, and how would I go about automatically
dispatching an e-mail to myself when something like this happens? Is
a pool going DEGRADED not the same thing as a failure in FMA's eyes?
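
In the meantime I'm tempted to just cron a stopgap check like the
sketch below (the recipient address and subject are placeholders,
obviously) until I find the proper notification mechanism:

  #!/bin/sh
  # Mail me if any pool is unhealthy. 'zpool status -x' prints
  # "all pools are healthy" when there is nothing to report, and
  # per-pool detail otherwise.
  STATUS=`zpool status -x`
  if [ "$STATUS" != "all pools are healthy" ]; then
          echo "$STATUS" | mailx -s "zpool problem on `hostname`" me@example.com
  fi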

-- 
--khd