Timh Bergström wrote:
Hi,
It does indeed. I am running a really old version of ZFS (3?), so I
figured a newer release would at least not panic, but the bug report
shows exactly what I saw.
A newer release should not panic, or at least not at the same place.
If it does, then we might be seeing a regression, which would need a
new bug to be filed against it.
-- richard
I'll give it a shot, thanks.
//Timh
On 11 June 2009 17:35, Richard Elling<[email protected]> wrote:
This sounds like
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6587723
which was fixed a long time ago. You might check that bug against your
stack trace (which was not included in this post).
You may be able to boot from a later OS release and import/export the pool
to repair.
-- richard
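
For reference, the import/export cycle suggested above could look roughly like the sketch below when run from a newer OS release. This is only an illustration, not a verified recovery procedure: the pool name is taken from the original report, the DRY_RUN switch and the run/repair_pool helpers are made up for this example, and -f is assumed to be needed because the pool was never cleanly exported.

```shell
#!/bin/sh
# Sketch only: pool name comes from the original report; run from a newer OS release.
POOL="${POOL:-migrated_data}"
DRY_RUN="${DRY_RUN:-1}"   # assumption: print the commands by default instead of executing them

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping the import/export cycle.
repair_pool() {
    run zpool import                 # list pools that are visible for import
    run zpool import -f "$POOL"      # force the import; the pool was not exported cleanly
    run zpool status -v "$POOL"      # check the pool state after the import
    run zpool export "$POOL"         # export it cleanly again
}

repair_pool
```

Setting DRY_RUN=0 would actually execute the commands; whether the import succeeds depends on how badly the metadata is damaged.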
Timh Bergström wrote:
Hi all,
I've encountered a not-so-fun problem with one of our pools. The pool
was built with raidz1 according to the ZFS manual, and the discs were
imported through an ERQ 16x750GB FC array (exported as JBOD) via
(QLogic) FC HBAs to Solaris 10u3 (x86). Everything worked fine
and dandy until this morning, when the disc enclosure "crashed" (reason
unknown) and subsequently dragged the whole system down with it. I didn't
get the coredump at the time, but now that I've restarted,
reattached the enclosure and tried to import the zpool again, I got the
following:
# zpool status -vx
pool: migrated_data
state: FAULTED
status: The pool metadata is corrupted and the pool cannot be opened.
action: Destroy and re-create the pool from a backup source.
see: http://www.sun.com/msg/ZFS-8000-CS
scrub: none requested
config:
...
And just a couple of seconds after zpool status -vx the machine coredumps
with:
panic[cpu0]/thread=fffffe80fcd34ba0: BAD TRAP: type=e (#pf Page fault)
rp=fffffe800138cb10 addr=0 occurred in module "zfs" due to a NULL pointer
dereference
zpool: #pf Page fault
Bad kernel fault at addr=0x0
pid=1116, pc=0xfffffffff0663b45, sp=0xfffffe800138cc00, eflags=0x10202
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
cr2: 0 cr3: e5f2000 cr8: c
rdi: ffffffff80039200 rsi: ffffffff89d883c0 rdx: 0
rcx: fffffe80e3667000 r8: 1 r9: 0
rax: 0 rbx: 1 rbp: fffffe800138cc10
r10: ffffffff938eb920 r11: 3 r12: ffffffffb0bc4080
r13: ffffffffb0bc42f0 r14: 1 r15: 0
fsb: ffffffff80000000 gsb: fffffffffbc240e0 ds: 43
es: 43 fs: 0 gs: 1c3
trp: e err: 0 rip: fffffffff0663b45
cs: 28 rfl: 10202 rsp: fffffe800138cc00
ss: 30
...
This occurs a couple of seconds after the system is fully booted. I've
tried several times to be fast enough to unconfigure the
FC controllers, but I was too slow :-). So I shut the path from the machine
to the FC enclosure, and of course the pool is now "UNAVAIL", which is
OK since my other pools work fine.
I'm curious though: how can metadata be corrupted like that? Why does
the system panic? Can it be repaired?
I know I should have backups but I don't, and if it's a lost cause that's
fine; the data itself is not important.
_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss