Re: [zfs-discuss] Corrupt meta data, the coredump

Richard Elling Thu, 11 Jun 2009 08:36:11 -0700

This sounds like
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6587723
which was fixed a long time ago.  You might check that bug against your
stack trace (which was not included in this post).


You may be able to boot from a later OS release and import/export the pool
to repair.
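
A minimal sketch of that import/export cycle, to be run as root after booting
the newer release. The pool name is taken from the status output quoted below.
The run() wrapper and the DRY_RUN switch are hypothetical helpers added for
this example, not zpool features, and the use of -f assumes the pool was never
cleanly exported before the crash.

```shell
#!/bin/sh
# Sketch only: assumes a release newer than Solaris 10u3 is booted
# and that the FC enclosure is visible to it.
POOL="${POOL:-migrated_data}"   # pool name from the quoted zpool status output
DRY_RUN="${DRY_RUN:-1}"         # hypothetical safety switch: 1 = only print commands

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

repair_pool() {
    run zpool import                # list pools that are available for import
    run zpool import -f "$POOL"     # force the import (assumption: no clean export happened)
    run zpool status -v "$POOL"     # check the pool state before going further
    run zpool export "$POOL"        # clean export so the original system can import it again
}
```

Calling repair_pool with the defaults only prints the four commands; setting
DRY_RUN=0 before the call actually runs them. Whether the import succeeds
depends on the bug above really being the cause.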
-- richard

Timh Bergström wrote:

Hi all,

I've run into a not-so-fun problem with one of our pools. The pool
was built with raidz1 according to the ZFS manual; the disks were
imported through an ERQ 16x750GB FC array (exported as JBOD) via
QLogic FC HBAs to Solaris 10u3 (x86). Everything worked fine
and dandy until this morning, when the disk enclosure "crashed" (reason
unknown) and dragged the whole system down with it. I didn't
get the coredump at the time, but now that I've restarted,
reattached the enclosure, and tried to import the zpool again, I get the
following:

# zpool status -vx
pool: migrated_data
state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
see: http://www.sun.com/msg/ZFS-8000-CS
scrub: none requested
config:
...

And just a couple of seconds after zpool status -vx the machine coredumps with:

panic[cpu0]/thread=fffffe80fcd34ba0: BAD TRAP: type=e (#pf Page fault) rp=fffffe800138cb10 addr=0 occurred in module "zfs" due to a NULL pointer dereference
zpool: #pf Page fault
Bad kernel fault at addr=0x0
pid=1116, pc=0xfffffffff0663b45, sp=0xfffffe800138cc00, eflags=0x10202
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
cr2: 0 cr3: e5f2000 cr8: c
         rdi: ffffffff80039200 rsi: ffffffff89d883c0 rdx:                0
         rcx: fffffe80e3667000  r8:                1  r9:                0
         rax:                0 rbx:                1 rbp: fffffe800138cc10
         r10: ffffffff938eb920 r11:                3 r12: ffffffffb0bc4080
         r13: ffffffffb0bc42f0 r14:                1 r15:                0
         fsb: ffffffff80000000 gsb: fffffffffbc240e0  ds:               43
         es:               43  fs:                0  gs:              1c3
         trp:                e err:                0 rip: fffffffff0663b45
          cs:               28 rfl:            10202 rsp: fffffe800138cc00
          ss:               30
...

This occurs a couple of seconds after the system is fully booted. I've
tried several times to be fast enough to unconfigure the
FC controllers, but I was too slow :-). So I shut off the machine's path
to the FC enclosure, and of course the pool is now "UNAVAIL", which is
OK since my other pools work fine.

I'm curious though: how can metadata be corrupted like that? Why does
the system panic? Can it be repaired?

I know I should have backups, but I don't. If it's a lost cause, that's
fine; the data itself is not important.

_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
