Liam Slusser wrote:
Long story short, my cat jumped on my server at my house, crashing two drives at 
the same time.  It was a 7-drive raidz (next time I'll do raidz2).

Long story short - we've been able to get access to the data in the pool. This involved finding a better, older state with the help of 'zdb -t', verifying metadata checksums with 'zdb -eubbcsL', extracting the configuration from the pool, making a cache file from the extracted configuration, and finally importing the pool (read-only at the moment) to back up the data.

As soon as it is backed up, we'll try to do a read-write import...

victor


The server crashed complaining about a drive failure, so I rebooted into single 
user mode, not realizing that two drives had failed.  I put in a new 500g 
replacement and had zfs start a replace operation, which failed at about 2% 
because there were two broken drives.  At that point I turned off the computer 
and sent both drives to a data recovery place.  They were able to recover the 
data on one of the two drives (the one that I started the replace operation on) 
- great - that should be enough to get my data back.

I popped the newly recovered drive back in.  It had an older txg number than the 
other drives, so I made a backup of each drive and then modified the txg number 
to an earlier txg number so they all match.

However, I am still unable to import the pool - I'm getting the following error 
(it doesn't matter if I use -f or -F):

bash-3.2# zpool import data
  pool: data
    id: 6962146434836213226
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        data           UNAVAIL  missing device
          raidz1       DEGRADED
            c0t0d0     ONLINE
            c0t1d0     ONLINE
            replacing  ONLINE
              c0t2d0   ONLINE
              c0t7d0   ONLINE
            c0t3d0     UNAVAIL  cannot open
            c0t4d0     ONLINE
            c0t5d0     ONLINE
            c0t6d0     ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.

Now I should have enough online devices to import the pool and get my data off, 
but no luck.  I'm not really sure where to go at this point.

Do I have to fake a c0t3d0 drive so it thinks all drives are there?  Can 
somebody point me in the right direction?

thanks,
liam



p.s.  To help me find which uberblocks to modify to reset the txg, I wrote a 
little perl program which finds and prints out the information needed to revert 
to an earlier txg value.

It's a little messy since I wrote it quickly, super late at night - but maybe it 
will help somebody else out.

http://liam821.com/findUberBlock.txt (its just a perl script)

It's easy to run.  It pulls in 256k of data (skipping X kbytes first if you use 
-s ###), sorts it, and then searches for uberblocks.  (Remember there are 4 
labels: at 0 and 256k, and two at the end of the disk.  You need to figure out 
the skip value for the end labels manually...)  Calculating the GUID seems to 
always fail because the number is too large for perl, so it returns a negative 
number.  Meh, it wasn't important enough to try to figure out.
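
For anyone who would rather compute the skip values for the two end labels than find them by hand, here is a minimal Python sketch. It assumes the standard label layout (four 256 KiB labels, two at the start and two at the end, with the usable size rounded down to a multiple of 256 KiB) and that offsets are counted from the start of the slice ZFS actually uses. The function names and the 500 GB size are made up for illustration; they are not part of findUberBlock.

```python
LABEL_SIZE = 256 * 1024   # each vdev label is 256 KiB
LABEL_COUNT = 4           # two at the front, two at the end


def label_offsets(device_bytes):
    """Return the byte offsets of the four labels for a device of this size."""
    # Assumption: the usable size is aligned down to a multiple of the label size.
    usable = device_bytes - (device_bytes % LABEL_SIZE)
    if usable < LABEL_COUNT * LABEL_SIZE:
        raise ValueError("device too small to hold four labels")
    return [0, LABEL_SIZE, usable - 2 * LABEL_SIZE, usable - LABEL_SIZE]


def skip_values_kb(device_bytes):
    """Same offsets expressed in KiB, the unit the -s option appears to take."""
    return [offset // 1024 for offset in label_offsets(device_bytes)]


if __name__ == "__main__":
    size = 500 * 1000 ** 3  # hypothetical 500 GB slice, for illustration only
    for index, kb in enumerate(skip_values_kb(size)):
        print("label %d: skip %d KiB" % (index, kb))
```

The real size in bytes has to come from the slice itself (for example from the partition table), not from the nominal drive capacity.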

(The info below has NOTHING to do with my disk problem above; it's from a happy 
and healthy server that I wrote the tool on.)

- find newest txg number
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n
block=148 (0025000) transaction=15980419

- print verbose output
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -v
block=148 (0025000)
    zfs_ver=3                           (0003 0000 0000 0000)
    transaction=15980419                        (d783 00f3 0000 0000)
    guid_sum=-14861410676147539 (7aad 2fc9 33a0 ffcb)
    timestamp=1253958103                (e1d7 4abd 0000 0000)
        (Sat Sep 26 02:41:43 2009)

    raw =       0025000 b10c 00ba 0000 0000 0003 0000 0000 0000
                0025010 d783 00f3 0000 0000 7aad 2fc9 33a0 ffcb
                0025020 e1d7 4abd 0000 0000 0001 0000 0000 0000
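
The raw words in the dump above can be decoded with a few lines of Python. This is only a sketch: it assumes the dump shows little-endian 16-bit words (as od prints them on x86) and that the first five 64-bit fields are magic, version, txg, guid_sum and timestamp, which is what the verbose output suggests. Reading guid_sum as an unsigned 64-bit value avoids the negative number; the helper names are hypothetical.

```python
import struct

UBERBLOCK_MAGIC = 0x00BAB10C  # matches the leading "b10c 00ba" words in the dump

# The first 40 bytes of the raw dump above, offsets removed.
DUMP = ("b10c 00ba 0000 0000 0003 0000 0000 0000 "
        "d783 00f3 0000 0000 7aad 2fc9 33a0 ffcb "
        "e1d7 4abd 0000 0000")


def words_to_bytes(dump):
    """Convert space-separated 16-bit hex words into the underlying bytes."""
    # Assumption: each word was printed from little-endian data.
    return b"".join(struct.pack("<H", int(word, 16)) for word in dump.split())


def parse_uberblock(raw):
    """Decode the first five unsigned little-endian 64-bit fields."""
    magic, version, txg, guid_sum, timestamp = struct.unpack_from("<5Q", raw, 0)
    if magic != UBERBLOCK_MAGIC:
        raise ValueError("no little-endian uberblock magic at this offset")
    return {"version": version, "txg": txg,
            "guid_sum": guid_sum, "timestamp": timestamp}


def to_signed64(value):
    """Reinterpret an unsigned 64-bit value as signed (what the tool printed)."""
    return value - (1 << 64) if value >= (1 << 63) else value


if __name__ == "__main__":
    fields = parse_uberblock(words_to_bytes(DUMP))
    print(fields)
    print("guid_sum as signed:", to_signed64(fields["guid_sum"]))
```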

- list all uberblocks
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -l
block=145 (0024400) transaction=15980288
block=146 (0024800) transaction=15980289
block=147 (0024c00) transaction=15980290
block=148 (0025000) transaction=15980291
block=149 (0025400) transaction=15980292
block=150 (0025800) transaction=15980293
block=151 (0025c00) transaction=15980294
block=152 (0026000) transaction=15980295
block=153 (0026400) transaction=15980296
block=154 (0026800) transaction=15980297
block=155 (0026c00) transaction=15980298
block=156 (0027000) transaction=15980299
block=157 (0027400) transaction=15980300
block=158 (0027800) transaction=15980301
.
.
.

- skip 256k into the disk and find the newest uberblock
bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -n -s 256
block=507 (7ec00) transaction=15980522

Now let's say I want to go back in time on this; the program can help me 
do that.  If I wanted to go back in time to txg 15980450...

bash-3.00# /tmp/findUberBlock /dev/dsk/c0t1d0 -t 15980450
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=180 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=181 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=182 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=183 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=184 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=185 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=186 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=187 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=188 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=189 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=190 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=191 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=192 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=193 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=194 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=195 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=196 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=197 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=198 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=199 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=200 count=1 conv=notrunc
dd if=/dev/zero of=/dev/dsk/c0t1d0 bs=1k oseek=201 count=1 conv=notrunc

It prints out the dd commands you would use to do it.  It won't run them for 
you!
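
The block numbers in the dd commands above appear to follow from the uberblock ring layout, which the following Python sketch tries to reproduce. It assumes 128 slots of 1 KiB each, with a txg stored in slot (txg mod 128), and a ring that starts at block 145 on this particular disk (in the listing, block 145 holds txg 15980288, which is a multiple of 128). The newest txg of 15980472 is inferred from the last dd command, not stated anywhere, and the function names are hypothetical. Like the perl script, it only prints commands and covers only the first label.

```python
UBERBLOCK_SLOTS = 128  # assumed number of slots in one label's uberblock ring


def slot_block(txg, base_block):
    """1 KiB block number holding the uberblock for this txg."""
    return base_block + txg % UBERBLOCK_SLOTS


def rollback_commands(device, target_txg, newest_txg, base_block):
    """Build dd commands that zero every uberblock newer than target_txg."""
    if not 0 <= newest_txg - target_txg < UBERBLOCK_SLOTS:
        raise ValueError("target txg is no longer in the ring (or is in the future)")
    template = "dd if=/dev/zero of=%s bs=1k oseek=%d count=1 conv=notrunc"
    return [template % (device, slot_block(txg, base_block))
            for txg in range(target_txg + 1, newest_txg + 1)]


if __name__ == "__main__":
    # base_block=145 and newest_txg=15980472 are assumptions read off the output above.
    for command in rollback_commands("/dev/dsk/c0t1d0", 15980450, 15980472, 145):
        print(command)
```

With those assumed values the sketch prints the same 22 commands (oseek=180 through oseek=201) as the run above. The other three labels would need the same treatment with their own base block.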

Anyway, maybe it will help somebody out sometime...

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
