Re: [zfs-discuss] Corrupt meta data, the coredump

Timh Bergström Thu, 11 Jun 2009 21:34:49 -0700

Hi,

It does indeed. I am running a really old version of ZFS (3?), so I
figured a newer release would at least not panic, but the bug report
shows exactly what I saw.


I'll give it a shot, thanks.
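
For reference, the import/export suggestion quoted below amounts to roughly
the following, run after booting from a later OS release. This is only a
sketch: it prints the commands rather than running them, the helper name
plan_repair is made up for illustration, and whether -f is needed depends on
how the pool was left when the system went down.

```shell
#!/bin/sh
# Sketch only: prints the zpool commands instead of executing them.
# plan_repair is a hypothetical helper name, not part of ZFS.
POOL="${POOL:-migrated_data}"   # pool name from the zpool status output in this thread

plan_repair() {
    pool="$1"
    echo "zpool import"            # list the pools the newer release can see
    echo "zpool import -f $pool"   # -f forces import if the pool still looks active
    echo "zpool status -v $pool"   # check the pool state after import
    echo "zpool export $pool"      # export cleanly before booting the old release
}

plan_repair "$POOL"
```

Running "zpool upgrade" while on the newer release should be avoided if the
pool has to go back to the old system, since an upgraded pool cannot be read
by older ZFS versions.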

//Timh

On 11 June 2009 17:35, Richard Elling <[email protected]> wrote:
> This sounds like
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6587723
> which was fixed a long time ago.  You might check that bug against your
> stack trace (which was not included in this post).
>
> You may be able to boot from a later OS release and import/export the pool
> to repair.
> -- richard
>
> Timh Bergström wrote:
>>
>> Hi all,
>>
>> I've encountered a not-so-fun problem with one of our pools. The pool
>> was built with raidz1 according to the ZFS manual, and the discs were
>> imported through an ERQ 16x750GB FC array (exported as JBOD) via
>> (QLogic) FC HBAs to Solaris 10u3 (x86). Everything worked fine
>> and dandy until this morning, when the disc enclosure "crashed" (reason
>> unknown) and subsequently dragged the whole system down with it. I didn't
>> get the coredump at the time, but now that I've restarted,
>> reattached the enclosure and tried to import the zpool again, I get the
>> following:
>>
>> # zpool status -vx
>> pool: migrated_data
>> state: FAULTED
>> status: The pool metadata is corrupted and the pool cannot be opened.
>> action: Destroy and re-create the pool from a backup source.
>> see: http://www.sun.com/msg/ZFS-8000-CS
>> scrub: none requested
>> config:
>> ...
>>
>> And just a couple of seconds after zpool status -vx the machine coredumps
>> with:
>>
>> panic[cpu0]/thread=fffffe80fcd34ba0: BAD TRAP: type=e (#pf Page fault)
>> rp=fffffe800138cb10 addr=0 occurred in module "zfs" due to a NULL pointer
>> dereference
>> zpool: #pf Page fault
>> Bad kernel fault at addr=0x0
>> pid=1116, pc=0xfffffffff0663b45, sp=0xfffffe800138cc00, eflags=0x10202
>> cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
>> cr2: 0 cr3: e5f2000 cr8: c
>>         rdi: ffffffff80039200 rsi: ffffffff89d883c0 rdx:                0
>>         rcx: fffffe80e3667000  r8:                1  r9:                0
>>         rax:                0 rbx:                1 rbp: fffffe800138cc10
>>         r10: ffffffff938eb920 r11:                3 r12: ffffffffb0bc4080
>>         r13: ffffffffb0bc42f0 r14:                1 r15:                0
>>         fsb: ffffffff80000000 gsb: fffffffffbc240e0  ds:               43
>>         es:               43  fs:                0  gs:              1c3
>>         trp:                e err:                0 rip: fffffffff0663b45
>>          cs:               28 rfl:            10202 rsp: fffffe800138cc00
>>          ss:               30
>> ...
>>
>> This occurs a couple of seconds after the system is fully booted. I've
>> tried several times to be fast enough to unconfigure the
>> FC controllers, but... too slow :-) . So I shut the path from the machine
>> to the FC enclosure, and of course the pool is now "UNAVAIL", which is
>> OK since my other pools work fine.
>>
>> I'm curious though: how can metadata be corrupted like that? Why does
>> the system panic? Can it be repaired?
>>
>> I know I should have backups but I don't, and if it's a lost cause it's
>> fine; the data itself is not important.
>>
>>
>



-- 
Timh Bergström
System Operations Manager
Diino AB - www.diino.com
:wq
_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
