I have a box running snv_134 that had a little boo-boo. The problem first started a couple of weeks ago with some corruption on two filesystems in an 11-disk, 10 TB raidz2 set. I ran a couple of scrubs, which revealed a handful of corrupt files on my two de-duplicated ZFS filesystems. No biggie.
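For reference, finding the corrupt files was just the usual scrub-and-status sequence, along the lines of the following (the pool name "tank" is a placeholder):

    # zpool scrub tank
    # zpool status -v tank

The -v output is what listed the individual files with permanent errors.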
I thought that my problems had something to do with de-duplication in 134, so I set about creating new filesystems and copying the "good" files over to another box. Every time I touched the "bad" files I got error 5 from the filesystem. Trying to delete them manually produced kernel panics, which eventually turned into reboot loops. I installed Nexenta on another disk to see whether that would let me get past the reboot loop, which it did. I finished moving the "good" files over (using rsync, which skipped the error-5 files, unlike cp or mv) and destroyed one of the two filesystems.

Unfortunately, that caused a kernel panic in the middle of the destroy operation, which then became another panic/reboot loop. I was able to get in with milestone=none and delete the ZFS cache, but now I have a new problem: any attempt to import the pool results in a panic. I have tried from my snv_134 install, from the live CD, and from Nexenta. I have tried various zdb incantations (with aok=1 and zfs:zfs_recover=1) to no avail; these error out after a few minutes. I have even tried another controller. I now have zdb -e -bcsvL running from 134 (without aok=1), which has been going for several hours.

My questions:

- Can zdb recover from this kind of situation (a half-destroyed filesystem that panics the kernel on import)?
- What is the impact of the above zdb operation without aok=1?
- Is there any likelihood of recovering the non-affected filesystems?

Any suggestions?

Regards,
Matthew Ellison
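P.S. For completeness, the recovery settings and the zdb run mentioned above are roughly as follows (the pool name "tank" is again a placeholder). The two set lines went into /etc/system and take effect on reboot:

    * recovery tunables used during the import attempts
    set aok=1
    set zfs:zfs_recover=1

and the zdb traversal currently running is:

    # zdb -e -bcsvL tank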