I had a 4-disk RAIDZ1 array. I did not monitor its status as closely
as I should have. My first sign of trouble was that programs doing
writes all locked up. When I looked, two drives in the array were
showing problems via "zpool status".
After purchasing more hard drives to move the data to (which is fun
given the shortages at the moment; I ended up getting past Amazon's
one-per-order limit by having my wife, brother, and sister all order
for me), I'm looking at recovering the data.
The first bad drive cannot be seen by the system at all. It spins up
but then clicks. Let's assume it's a ~$3000 trip to the data recovery
people away from ever being read again.
The second bad drive can be seen by the system. smartctl reports
that the disk is failing, but I was able to use ddrescue to make a
full copy of the device. ddrescue did find some errors, but it seems
to have worked around them.
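For reference, the copy was done along these lines (the source device
name and file names here are placeholders, not my exact invocation):

```shell
# First pass: copy everything readable, skipping slow scraping of bad areas.
ddrescue -n /dev/sdb bad2.img bad2.map
# Second pass: go back and retry the bad sectors a few times.
ddrescue -r3 /dev/sdb bad2.img bad2.map
```

The map file lets ddrescue resume and track which regions still have
errors, so the two passes can be run separately.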
The other two disks seem to be fine.
I attached the image of the second bad drive to loop0, made symlinks
to loop0 and the other two drives' device files in $PWD, and tried to
import:
$ zpool import -f -d . bank0
cannot import 'bank0': I/O error
Destroy and re-create the pool from
a backup source.
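The setup leading up to that import looked roughly like this (the
image name and the two healthy device names are guesses for my
layout, not the literal paths):

```shell
# Attach the ddrescue image to a loop device.
losetup /dev/loop0 bad2.img
# Symlink all three surviving vdevs into the current directory
# so "zpool import -d ." can find them together.
ln -s /dev/loop0 .
ln -s /dev/sdc .
ln -s /dev/sdd .
```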
$ zpool import -fFX -d . bank0
# runs for 6 hours and then prints out something like "one or more
devices is currently unavailable"
Looking at the output of "zdb -ve bank0", I think what's happening is
that the disk image is marked as "faulted: 1" and "aux_state:
'err_exceeded'". Perhaps if that could be cleared, the import would
work? I think if this were a pool that was already imported, you
could clear the error with "zpool clear bank0 $DEV_NAME".
Output of "zdb -ve bank0":
Configuration for import: