Wow, thanks Bjoern for that, now I really know what was going on. I
really appreciate the time you took to explain all that.

The problem I'm facing is that I can't detach the drive. A "zpool
detach pool diskx" gives me the error:
"cannot detach diskx: only applicable to mirror and replacing vdevs."

I managed to format the disk as HFS+, zero the drive completely and
then format it as ZFS, but ZFS still considers this disk part of the pool.

What can I do to get the drive out of the pool?
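
If it helps, I can run something like the following and post the
output ("diskx" standing for the drive in question):

    zpool status -v             # how the pool lists its devices right now
    sudo zdb -l /dev/rdiskx     # any vdev labels ZFS still finds on the raw disk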

On 26.10.14 14:43, Bjoern Kahl wrote:
> 
> (this is going to be a bit longer, but since it is a recurring
> topic I'd like to provide some background information on what
> happens behind the scenes)
> 
> 
> On 26.10.14 at 12:09, 'Busty' via zfs-macos wrote:
>> This generated a follow up question:
> 
>> I did the zpool replace with an unformatted disk as described in
>> the Oracle documentation. After that, zpool status showed the
>> disk as part of the pool, but as "disk2", not as "disk2s2".
>> Accordingly, OSX wanted to initialize the disk every time upon
>> booting.
> 
>> So I formatted the disk as described in the getting started
>> guide on MacZFS, which resolved the problem of OSX wanting to
>> initialize the disk, but it still shows as "disk2" (without the
>> s2) in zpool status. I was prepared to resilver the disk again
>> after that, but it was still part of the pool.
> 
>> I started a scrub, had 6 checksum errors on that disk right at
>> the beginning, but otherwise the scrub seems to consider the data
>> as good. It is at 7 percent right now.
> 
>> Should I be worried that the data is not intact?
> 
> Yes, you should.
> 
> You basically did the following:
> 
> 1)
> 
> Gave a whole disk to ZFS, telling it that it is OK to use the whole
> space from the first to the last block of the disk.
> 
> ZFS did so and started writing data:
> 
> a) its vdev labels 0 and 1, from block 0 to 1023 (assuming 512-byte
> blocks)
> 
> b) its vdev labels 2 and 3, from block N-1024 to N-1 (assuming N
> blocks on the disk)
> 
> c) your pool data in between, following its somewhat complex
> allocation scheme
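> 
> (Just as an illustration of that layout: with 512-byte blocks and N
> blocks on the disk, one could peek at the region holding labels 2 and
> 3 with something like
> 
>     sudo dd if=/dev/rdiskX bs=512 skip=$((N - 1024)) count=1024 | hexdump -C
> 
> where "diskX" and N are placeholders for your disk and its block count.)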
> 
> 
> 2)
> 
> Told OS X to write a disk label (aka GPT) on the disk.
> 
> OS X did so and started writing data:
> 
> a) A protective MBR in block 0 -> no damage, ZFS anticipates that,
> leaving blocks 0 to 31 (the first 16 kB) of its label alone.
> 
> b) The primary GPT structures, starting from block 1 (byte
> position 512) to end of block 33 (byte position 17408). This
> trashed part of the configuration dictionary in vdev label 0
> 
> c) The secondary GPT structures, in the last 17408 bytes of the
> disk, overwriting part of the uberblock array in vdev label 3.
> 
> d) The Mac OS X EFI area, usually around block 40 to 409600 (byte 
> positions up to 200 MB). This is "/dev/diskXs1".
> 
> e) The main partition "/dev/diskXs2", roughly starting at block
> 409640 and extending to a few blocks before the secondary GPT
> structures. This is only created, but not written to, if "noformat"
> has been used.
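> 
> (For what it's worth, the resulting layout can be inspected with
> 
>     sudo gpt -r show /dev/diskX
> 
> which lists each GPT entry with its start block and size; "diskX" is
> again a placeholder for the disk in question.)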
> 
> 
> 
> What does this mean?
> --------------------
> 
> 
> It depends on how ZFS sees the disk.  Most likely it will continue
> to use "diskX" (no slice).  In that case:
> 
> The pool keeps functioning, since vdev labels 1 and 2 are undamaged
> (0 and 3 are overwritten, see above)
> 
> ZFS will almost instantly fix its labels, completely overwriting
> the secondary GPT.  Mac OS X doesn't care; it writes the secondary
> GPT and never looks there again.
> 
> The situation at the start of the disk is more complex.
> 
> ZFS will also almost instantly fix its label 0. However, this
> writes only from block 32 on (byte position 16384 onwards), since
> it completely ignores the first 16 blocks (supposed to hold disk 
> identifier) and doesn't touch the next 16 in normal operation,
> since they are supposed to hold ZFS boot code and are unused in
> current implementations.
> 
> So the rewritten vdev label 0 trashes the last 1024 bytes (blocks 32
> and 33) of the primary GPT.  This does concern Mac OS X, and you
> should see a warning about an invalid GPT CRC in the system log after
> boot.
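> 
> (The exact wording, and the log location, depend on the OS X version,
> but something along the lines of
> 
>     grep -i gpt /var/log/system.log
> 
> should turn up that complaint if it has been logged.)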
> 
> 
> So much for the administrative data structures.  What about your
> data?
> 
> ZFS's data area starts after vdev label 1, i.e. at block 1024
> (byte position 512 kB).  This is somewhere inside the EFI area,
> overwriting whatever Mac OS X placed there (depends on the version;
> older Mac OS X versions didn't place anything there, I don't know
> about newer versions).  In any case, Mac OS X does not access the EFI
> area in normal operation, and so won't notice the damage.
> 
> On the other hand, Mac OS X does initialize the EFI area when
> initializing a disk, placing an empty FAT file system there.
> 
> This FAT file system overwrote part of the ZFS pool data and caused
> the checksum errors.
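> 
> (These show up as the per-device CKSUM counts in
> 
>     zpool status -v <pool>
> 
> that is the counter behind the errors you saw at the start of the
> scrub.)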
> 
> 
> What to do now?
> ---------------
> 
> I would detach the disk in question, zap the first and last several
> MB of disk space (i.e. of diskX itself, not of the diskXs2 slice)
> by writing zero bytes to the disk, for example using "dd", reformat
> with diskutil, and reattach it as /dev/diskXs2.
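> 
> (A rough sketch of the dd part; "diskX" and NNNN are placeholders for
> the disk and its 512-byte sector count as reported by "diskutil info
> diskX", and this is destructive, so double-check the device name:
> 
>     diskutil unmountDisk diskX
>     sudo dd if=/dev/zero of=/dev/rdiskX bs=1m count=10
>     sudo dd if=/dev/zero of=/dev/rdiskX bs=512 seek=$((NNNN - 20480)) count=20480
> 
> The first dd zeroes the first 10 MB, the second one the last 10 MB of
> the disk.)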
> 
> Another approach for zapping the disk content is to format it as
> HFS+ with diskutil and then select "clear/erase free disk space"
> (or whatever the English button label is).
> 
> 
> Best regards
> 
> Björn
> 
>> On 23.10.14 14:01, 'Busty' via zfs-macos wrote:
> 
>>> This was in fact easier than I thought. What did the trick was 
>>> to physically swap the faulty disk with a new one and then
>>> "zpool detach (faulty disk)"
>>> 
>>> After that a "zpool replace" went like a charm.
>>> 
>>> Problem solved.
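>>> 
>>> Roughly, with placeholder names:
>>> 
>>>     zpool detach <pool> <flaky new disk>   # works while it is part of the "replacing" vdev
>>>     zpool replace <pool> <old disk or GUID> <fresh disk>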
>>> 
>>> On 15.10.14 20:32, 'Busty' via zfs-macos wrote:
>>>> In my pool, I had a disk that got a smart error (bad block),
>>>> so I pulled it out, installed a new one and made a "zpool
>>>> replace disk5s2 806745480046791602". (That number was shown
>>>> when typing "zpool status" as the missing device.)
>>>> 
>>>> The resilver process started, but it seems that the new disk
>>>> is faulty, because it disappears from the device list
>>>> intermittently, but at least every 6 hours (I have
>>>> Temperature Monitor running, which shows me all disks by
>>>> serial number).
>>>> 
>>>> So I want to change it. But zpool detach <poolname>
>>>> dev/disk5s2 gives the error "no such device in pool".
>>>> 
>>>> How can I abort the resilvering process? Or is there another 
>>>> way to restart the resilvering with a new disk?
>>>> 
>>>> The original disk with the bad block is already on its way
>>>> to Western Digital (it was still in warranty).
> 
> 
> 
