On Sat, Nov 5, 2011 at 2:35 PM, Myers Carpenter <my...@maski.org> wrote:
> I would like to pick the brains of the ZFS experts on this list: What
> would you do next to try and recover this zfs pool?
I hate running across threads where someone asks a question and never comes back to say what they eventually did, so...
To summarize: In late October I had two drives fail in a raidz1 pool. I
was able to recover all the data from one drive, but the other could not be
seen by the controller. Trying to zpool import was not working. I had 3
of the 4 drives, so why couldn't I import the pool?
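For the record, "not working" meant that attempts along these lines (with 'tank' standing in for the real pool name) either didn't find the pool or refused to import it:

    zpool import            # scan /dev and list any importable pools
    zpool import -f tank    # try to force-import the pool by its (placeholder) name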
I read up on every option in zdb and tried the ones that might tell me
more about what was on the recovered drive. I eventually hit on:

    zdb -p devs -vvvve -lu /bank4/hd/devs/loop0

where /bank4/hd/devs/loop0 was a symlink back to /dev/loop0, on which I had
set up the disk image of the recovered drive.
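For anyone repeating this, the loop-device setup was something along these lines (the image filename is just an example, not my actual path):

    losetup /dev/loop0 /bank4/hd/recovered.img    # attach the dd image of the failed drive
    mkdir -p /bank4/hd/devs
    ln -s /dev/loop0 /bank4/hd/devs/loop0         # a device dir that zdb -p can scan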
Running that zdb command showed the uberblocks, which looked like this:
    magic = 0000000000bab10c
    version = 26
    txg = 23128193
    guid_sum = 13396147021153418877
    timestamp = 1316987376 UTC = Sun Sep 25 17:49:36 2011
    rootbp = DVA=<0:2981f336c00:400> DVA=<0:1e8dcc01400:400> DVA=<0:3b16a3dd400:400> [L0 DMU objset] fletcher4 lzjb LE contiguous unique triple size=800L/200P birth=23128193L/23128193P fill=255
Then it all became clear: this drive had encountered errors a month before
the other drive failed, and ZFS had stopped writing to it. In other words, I
really had only two up-to-date members of a four-drive raidz1, not three,
which is why the import failed.
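Had I known to look, comparing the newest uberblock txg on each member would have shown the same thing much sooner (the healthy-drive device path below is only an example):

    zdb -lu /bank4/hd/devs/loop0 | grep 'txg ='   # recovered drive: txg = 23128193, Sep 25
    zdb -lu /dev/sdb             | grep 'txg ='   # a healthy member: txgs roughly a month newer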
So the lesson here: don't be a dumbass like me. Set up Nagios or some
other monitoring system to alert you when a pool becomes degraded. ZFS works
so well with one drive out of the array that you probably won't notice a
problem unless you are proactively looking for it.
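Even a trivial cron job would have done it; here is a sketch (zpool status -x only prints "all pools are healthy" when nothing is wrong):

    #!/bin/sh
    # Periodic pool-health check: mail root if any pool is not healthy.
    status=$(zpool status -x)
    if [ "$status" != "all pools are healthy" ]; then
        echo "$status" | mail -s "ZFS pool problem on $(hostname)" root
    fi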