Hi Richard,

On 29.05.12 06:54, Richard Elling wrote:

On May 28, 2012, at 9:21 PM, Stephan Budach wrote:

Hi all,

just to wrap this issue up: since FMA didn't report any error other than the one that led to the degradation of that one mirror, I detached the original drive from the zpool, which flagged the mirror vdev as ONLINE (although the spare drive still showed a cksum error count of 23).
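
For reference, the detach itself was just the standard zpool detach; the device name below is only a placeholder, not the actual disk:

zpool detach obelixData <original-drive>
zpool status obelixData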

You showed the result of the FMA diagnosis, but not the error reports.
One feature of the error reports on modern Solaris is that the expected and reported
bit images are described, showing the nature and extent of the corruption.

Are you referring to these errors:

root@solaris11c:~# fmdump -e -u f0601f5f-cb8b-67bc-bd63-e71948ea8428
TIME                 CLASS
Mai 27 10:24:23.3654 ereport.fs.zfs.checksum
Mai 27 10:24:23.3652 ereport.fs.zfs.checksum
Mai 27 10:24:23.3650 ereport.fs.zfs.checksum
Mai 27 10:24:23.3648 ereport.fs.zfs.checksum
Mai 27 10:24:23.3646 ereport.fs.zfs.checksum
Mai 27 10:24:23.2696 ereport.fs.zfs.checksum
Mai 27 10:24:23.2694 ereport.fs.zfs.checksum
Mai 27 10:24:23.2692 ereport.fs.zfs.checksum
Mai 27 10:24:23.2690 ereport.fs.zfs.checksum
Mai 27 10:24:23.2688 ereport.fs.zfs.checksum
Mai 27 10:24:23.2686 ereport.fs.zfs.checksum

And to pick one in detail:

root@solaris11c:~# fmdump -eV -u f0601f5f-cb8b-67bc-bd63-e71948ea8428
TIME                           CLASS
Mai 27 2012 10:24:23.365451280 ereport.fs.zfs.checksum
nvlist version: 0
        class = ereport.fs.zfs.checksum
        ena = 0xdfb23b0bc9700001
        detector = (embedded nvlist)
        nvlist version: 0
                version = 0x0
                scheme = zfs
                pool = 0x855ebc6738ef6dd6
                vdev = 0x52e3ca377dbdbec9
        (end detector)

        pool = obelixData
        pool_guid = 0x855ebc6738ef6dd6
        pool_context = 0
        pool_failmode = wait
        vdev_guid = 0x52e3ca377dbdbec9
        vdev_type = disk
        vdev_path = /dev/dsk/c9t2100001378AC02F4d9s0
        vdev_devid = id1,sd@n2047001378ac02f4/a
        parent_guid = 0x695bf14bdabd6714
        parent_type = mirror
        zio_err = 50
        zio_offset = 0x2d8b974600
        zio_size = 0x20000
        zio_objset = 0x81ea9
        zio_object = 0x5594
        zio_level = 0
        zio_blkid = 0x3c
        cksum_expected = 0x12869460bd5d 0x49e4661395e6973 0xc974c2622ce7a035 0x81fe9ef14082a245
        cksum_actual = 0x1bba2b185478 0x707883eac587dd3 0x54de998365cc6a8d 0x6822e5f4add45237
        cksum_algorithm = fletcher4
        bad_ranges = 0x0 0x20000
        bad_ranges_min_gap = 0x8
        bad_range_sets = 0x357a5
        bad_range_clears = 0x3935b
        bad_set_histogram = 0x8f3 0xdd4 0x52c 0x13d0 0xd76 0xea0 0xec1 0x100f 0x8f0 0xdc7 0x51e 0x13e7 0xd6b 0xe87 0xf30 0xf9c 0x8cd 0xddc 0x51a 0x1458 0xd93 0xf0a 0xf04 0x102d 0x8b4 0xdea 0x51a 0x141d 0xdd3 0xefc 0xf18 0x1003 0x8bc 0xde9 0x52f 0x13a4 0xdd9 0xf07 0xea2 0x100d 0x8c1 0xdf4 0x4e6 0x1368 0xdce 0xed9 0xf27 0x1002 0x8bf 0xdf4 0x4fe 0x1396 0xd7d 0xee0 0xf2b 0xfcc 0x8d8 0xdd7 0x4fc 0x13b8 0xd8e 0xe8b 0xedb 0x100e
        bad_cleared_histogram = 0x0 0x46 0x211a 0xc77 0x124f 0x1146 0x113b 0x1020 0x0 0x35 0x20df 0xc9f 0x12dc 0x110c 0x10fc 0x1018 0x0 0x37 0x2103 0xcbb 0x12a9 0x113d 0x1100 0xf8d 0x0 0x35 0x210d 0xc6e 0x121a 0x1171 0x108f 0x1020 0x0 0x46 0x20ec 0xc3f 0x12ba 0x10ce 0x1172 0x1009 0x0 0x47 0x20a4 0xc5e 0x129f 0x1102 0x112e 0x1031 0x0 0x4a 0x20d1 0xc64 0x126b 0x1159 0x111c 0x1074 0x0 0x3a 0x20ed 0xc5b 0x1245 0x1160 0x111c 0xfc0
        __ttl = 0x1
        __tod = 0x4fc1e4b7 0x15c85810

They all came from the same vdev_path and covered these block IDs:

        zio_blkid = 0x3c
        zio_blkid = 0x3e
        zio_blkid = 0x40
        zio_blkid = 0x3a
        zio_blkid = 0x3d
        zio_blkid = 0xf
        zio_blkid = 0xc
        zio_blkid = 0x10
        zio_blkid = 0x12
        zio_blkid = 0x14
        zio_blkid = 0x11

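For reference, a quick way to pull those two fields out of the verbose dump is something like:

root@solaris11c:~# fmdump -eV -u f0601f5f-cb8b-67bc-bd63-e71948ea8428 | egrep 'vdev_path|zio_blkid'
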

I was really a bit surprised by the cksum errors on the spare drive, especially since no errors had been logged for it while it was resilvering.

We'll see what the scrub will tell us.
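For completeness, that just means kicking off a scrub and re-checking the per-device error counters afterwards, along the lines of:

zpool scrub obelixData
zpool status -v obelixData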

Thanks,
budy
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
