Hey Jason,

I'm not really that frustrated, as I feel I'm working my way towards a
solution with the help of you MacZFS guys.

I clearly didn't think it through when I told ZFS that it was OK to use
the whole disk instead of the s2 slice.

The issue seems to be that I can't tell ZFS that I want to start from
scratch with that disk; ZFS always recognizes the disk, as a whole
device, as already being part of the pool.

So, the options I see:

- I can physically replace the disk with a new one, this time
formatting it for ZFS before telling ZFS to replace it

- I can build the pool from scratch

(I would go for rebuilding the pool from scratch, as the disk in question
works fine when installed properly. Additionally, I wouldn't have to buy
another disk and wait for it.)

What do you guys think: is there another option?
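
(To make the question concrete, the "other option" I'm hoping for would
look roughly like this. It's only a sketch: disk2/rdisk2 stand in for
whatever device number the disk actually gets, and the diskutil line is
my reading of the MacZFS getting started guide:

  sudo zpool offline Collection disk2
  # wipe the old ZFS labels at the front of the disk
  sudo dd if=/dev/zero of=/dev/rdisk2 bs=1m count=10
  # the labels at the end of the disk would need wiping too; see the dd
  # example after Bjoern's "What to do now?" section below
  sudo diskutil partitionDisk disk2 GPTFormat ZFS %noformat% 100%
  sudo zpool replace Collection disk2 disk2s2

I suspect ZFS would still refuse the replace if it finds leftover labels
on the slice, which is exactly the problem I described above.)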


zpool status gives me:

Server:~ busty$ zpool status
  pool: Collection
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        Collection   ONLINE       0     0     0
          raidz1     ONLINE       0     0     0
            disk5s2  ONLINE       0     0     0
            disk4s2  ONLINE       0     0     0
            disk7s2  ONLINE       0     0     0
            disk3s2  ONLINE       0     0     0
            disk2    ONLINE       0     0     5
            disk1s2  ONLINE       0     0     0
            disk6s2  ONLINE       0     0     0

errors: No known data errors

But I bet I'll get a pocketful (a big pocket) of errors on disk2 when I
run a scrub, since I zeroed the disk completely.
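
(The plan for that, assuming the standard sequence is all that's needed:

  sudo zpool scrub Collection
  zpool status -v Collection     # watch progress and per-device error counts
  sudo zpool clear Collection disk2

i.e. scrub, see how bad the checksum counters get, and clear them
afterwards if the data itself stays healthy.)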




On 29.10.14 12:19, Jason Belec wrote:
> If I understand what I'm reading here, you have a disk that is in your pool, 
> and the pool is raidz, so you must always have the same number of devices 
> attached to the pool; this is a raidz law. You can replace a damaged device 
> with a new one, but you cannot remove the damaged one until the replace/resilver 
> is complete. You cannot stop a resilver once it has begun; you're going to have 
> to be patient. Once it is done, you can proceed with rectifying the issue. The 
> issues you are running into are due to not reading up and testing before 
> committing, and that seems to happen a lot. ZFS seems frustrating to you right 
> now because it is doing everything possible to protect the data you're messing 
> with. ;)
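> 
> (For the record, the usual replace sequence is just the following, with
> pool and device names as placeholders:
> 
>   zpool replace <pool> <old-device-or-guid> <new-device>
>   zpool status <pool>     # shows "resilver in progress" and a rough estimate
> 
> Nothing else needs to happen until the resilver reports completion.)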
> 
> 
> --
> Jason Belec
> Sent from my iPad
> 
>> On Oct 29, 2014, at 6:50 AM, 'Busty' via zfs-macos 
>> <zfs-macos@googlegroups.com> wrote:
>>
>> Thanks for the input, but:
>>
>> "only inactive hot spares can be removed", whereas I need to
>> remove/detach/whatever one disk of a raidz1 pool: no mirrors, no duplicates.
>>
>> I get the impression there is no way to do that, so I might have to
>> build the pool from scratch again, am I right?
>>
>>
>>> On 29.10.14 09:49, ilove...@icloud.com wrote:
>>> zpool attach makes a non-mirror into a mirror. zpool detach makes a mirror 
>>> into a non-mirror.
>>>
>>> I believe you are looking for zpool remove.
>>>
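>>> For reference, with pool and device names as placeholders:
>>>
>>>   zpool attach <pool> <existing-device> <new-device>   # adds a mirror side
>>>   zpool detach <pool> <device>                         # removes one side of a mirror
>>>   zpool remove <pool> <device>                         # hot spares and cache devices only
>>>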
>>>> On Wednesday, October 29, 2014 12:54:48 AM UTC-7, Busty wrote:
>>>>
>>>> Wow, thanks for that, Bjoern; now I really know what was going on. I 
>>>> really appreciate the time you took to explain all that. 
>>>>
>>>> The problem I'm facing is that I can't detach the drive. A "zpool 
>>>> detach pool diskx" gives me the error: 
>>>> "cannot detach diskx: only applicable to mirror and replacing vdevs." 
>>>>
>>>> I managed to format the disk as HFS+, zero the drive completely and 
>>>> then format it as ZFS, but ZFS still considers this disk part of the pool. 
>>>>
>>>> What can I do to get the drive out of the pool? 
>>>>
>>>>> On 26.10.14 14:43, Bjoern Kahl wrote: 
>>>>>
>>>>> (this is going to be a bit longer, but since it is a recurring 
>>>>> topic I'd like to provide some background information on what 
>>>>> happens behind the scenes) 
>>>>>
>>>>>
>>>>>> Am 26.10.14 um 12:09 schrieb 'Busty' via zfs-macos: 
>>>>>> This generated a follow up question:
>>>>>
>>>>>> I did the zpool replace with an unformatted disk as described in 
>>>>>> the Oracle documentation. After that, zpool status showed the 
>>>>>> disk as part of the pool, but as "disk2", not as "disk2s2". 
>>>>>> Accordingly, OS X wanted to initialize the disk every time upon 
>>>>>> booting.
>>>>>
>>>>>> So I formatted the disk as described in the getting started 
>>>>>> guide on MacZFS, which resolved the problem of OS X wanting to 
>>>>>> initialize the disk, but it still shows as "disk2" (without the 
>>>>>> s2) in zpool status. I was prepared to resilver the disk again 
>>>>>> after that, but it was still part of the pool.
>>>>>
>>>>>> I started a scrub, had 6 checksum errors on that disk right at 
>>>>>> the beginning, but otherwise the scrub seems to consider the data 
>>>>>> as good. It is at 7 percent right now.
>>>>>
>>>>>> Should I be worried that the data is not intact?
>>>>>
>>>>> Yes, you should. 
>>>>>
>>>>> You basically did the following: 
>>>>>
>>>>> 1) 
>>>>>
>>>>> Gave a whole disk to ZFS, telling it that it is OK to use the whole 
>>>>> space from the first to the last block of the disk. 
>>>>>
>>>>> ZFS did so and started writing data: 
>>>>>
>>>>> a) its vdev labels 0 and 1, from block 0 to 1023 (assuming 512-byte 
>>>>> blocks) 
>>>>>
>>>>> b) its vdev labels 2 and 3, from block N-1024 to N-1 (assuming N blocks 
>>>>> on the disk) 
>>>>>
>>>>> c) your pool data in between, following its somewhat complex 
>>>>> allocation scheme 
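>>>>>
>>>>> (If your MacZFS build ships zdb, those four labels can be inspected 
>>>>> directly, with diskX as a placeholder: 
>>>>>
>>>>> sudo zdb -l /dev/diskX 
>>>>>
>>>>> which dumps labels 0-3 and shows which of them are still readable.) 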
>>>>>
>>>>>
>>>>> 2) 
>>>>>
>>>>> Told OS X to write a disk label (aka GPT) on the disk. 
>>>>>
>>>>> OS X did so and started writing data: 
>>>>>
>>>>> a) A protective MBR in block 0 -> no damage, ZFS anticipates that, 
>>>>> leaving block 0 to 32 (16k) of its label alone. 
>>>>>
>>>>> b) The primary GPT structures, starting from block 1 (byte 
>>>>> position 512) to the end of block 33 (byte position 17408). This 
>>>>> trashed part of the configuration dictionary in vdev label 0. 
>>>>>
>>>>> c) The secondary GPT structures, in the last 17408 bytes of the 
>>>>> disk, overwriting part of the uberblock array in vdev label 3. 
>>>>>
>>>>> d) The Mac OS X EFI area, usually around block 40 to 409600 (byte 
>>>>> positions up to 200 MB). This is "/dev/diskXs1". 
>>>>>
>>>>> e) The main partition "/dev/diskXs2", roughly starting at block 
>>>>> 409640 and extending until some blocks before the secondary GPT 
>>>>> structures. This is just created but not written to, if "noformat" has 
>>>>> been used. 
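>>>>>
>>>>> (You can see the resulting layout for yourself, again with diskX as 
>>>>> a placeholder: 
>>>>>
>>>>> diskutil list diskX 
>>>>> sudo gpt -r show /dev/diskX 
>>>>>
>>>>> The gpt output lists the protective MBR, both GPT copies, the EFI 
>>>>> slice s1 and the main slice s2, together with their block offsets.) 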
>>>>>
>>>>>
>>>>>
>>>>> What does this mean? 
>>>>> -------------------- 
>>>>>
>>>>>
>>>>> It depends on how ZFS sees the disk.  Most likely it will continue 
>>>>> to use "diskX" (no slice).  In that case: 
>>>>>
>>>>> The pool keeps functioning, since vdev labels 1 and 2 are undamaged 
>>>>> (0 and 3 are overwritten, see above). 
>>>>>
>>>>> ZFS will almost instantly fix its labels, completely overwriting 
>>>>> the secondary GPT.  Mac OS X doesn't care; it writes the secondary 
>>>>> GPT and never looks there again. 
>>>>>
>>>>> The situation at the start of the disk is more complex. 
>>>>>
>>>>> ZFS will also almost instantly fix its label 0. However, this 
>>>>> writes only from block 32 on (byte position 16384 onwards), since 
>>>>> it completely ignores the first 16 blocks (supposed to hold the 
>>>>> disk identifier) and doesn't touch the next 16 in normal operation, 
>>>>> since they are supposed to hold ZFS boot code and are unused in 
>>>>> current implementations. 
>>>>>
>>>>> So the rewritten vdev label 0 trashes the last 512 bytes of the 
>>>>> primary GPT.  This does concern Mac OS X, and you should see a 
>>>>> warning about an invalid GPT CRC in the system log after boot. 
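>>>>>
>>>>> (A quick, rough way to check is to search the current log, e.g.: 
>>>>>
>>>>> grep -i gpt /var/log/system.log 
>>>>>
>>>>> Whether the kernel actually logs the bad CRC there depends on the 
>>>>> OS X version, so don't read too much into an empty result.) 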
>>>>>
>>>>>
>>>>> So much for the administrative data structures.  What about your 
>>>>> data? 
>>>>>
>>>>> ZFS' data area starts after vdev label 1, i.e. at block 1024 
>>>>> (byte position 512 kB).  This is somewhere inside the EFI area, 
>>>>> overwriting whatever Mac OS X placed there (this depends on the 
>>>>> version; older Mac OS X versions didn't place anything there, and 
>>>>> I don't know about newer versions).  In any case, Mac OS X does not 
>>>>> access the EFI area in normal operation, and so won't notice the damage. 
>>>>>
>>>>> On the other hand, Mac OS X initializes the EFI area when 
>>>>> initializing a disk, placing an empty FAT file system there. 
>>>>>
>>>>> This FAT file system overwrote part of the ZFS pool data and caused 
>>>>> the checksum errors. 
>>>>>
>>>>>
>>>>> What to do now? 
>>>>> --------------- 
>>>>>
>>>>> I would detach the disk in question, zap the first and last several 
>>>>> MB of disk space (i.e. of diskX itself, not of the diskXs2 slice) 
>>>>> by writing zero bytes to the disk, for example using "dd", reformat 
>>>>> with diskutil and reattach it as /dev/diskXs2. 
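>>>>>
>>>>> (Spelled out as a sketch, with diskX as a placeholder and N standing 
>>>>> for the 512-byte block count that "diskutil info diskX" reports: 
>>>>>
>>>>> # zap the first 10 MB (vdev labels 0/1 and the primary GPT) 
>>>>> sudo dd if=/dev/zero of=/dev/rdiskX bs=1m count=10 
>>>>>
>>>>> # zap the last 10 MB (vdev labels 2/3 and the secondary GPT) 
>>>>> sudo dd if=/dev/zero of=/dev/rdiskX bs=512 seek=$((N - 20480)) count=20480 
>>>>>
>>>>> # write a fresh GPT with an EFI slice and an untouched s2 slice 
>>>>> sudo diskutil partitionDisk diskX GPTFormat ZFS %noformat% 100% 
>>>>>
>>>>> The exact sizes don't matter much, as long as both label areas at 
>>>>> the start and the end of the disk are gone.) 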
>>>>>
>>>>> Another approach for zapping the disk content is to format it as 
>>>>> HFS+ with diskutil and then select "clear/erase free disk space" 
>>>>> (or whatever the English button label says). 
>>>>>
>>>>>
>>>>> Best regards 
>>>>>
>>>>> Björn 
>>>>>
>>>>>> On 23.10.14 14:01, 'Busty' via zfs-macos wrote:
>>>>>
>>>>>>> This was in fact easier than I thought. What did the trick was 
>>>>>>> to physically swap the faulty disk with a new one and then run 
>>>>>>> "zpool detach (faulty disk)". 
>>>>>>>
>>>>>>> After that, a "zpool replace" worked like a charm. 
>>>>>>>
>>>>>>> Problem solved. 
>>>>>>>
>>>>>>>> On 15.10.14 20:32, 'Busty' via zfs-macos wrote: 
>>>>>>>> In my pool, I had a disk that got a SMART error (bad block), 
>>>>>>>> so I pulled it out, installed a new one and ran "zpool 
>>>>>>>> replace disk5s2 806745480046791602". (That number was shown 
>>>>>>>> by "zpool status" as the missing device.) 
>>>>>>>>
>>>>>>>> The resilver process started, but it seems that the new disk 
>>>>>>>> is faulty, because it disappears from the device list 
>>>>>>>> intermittently, but still at least every 6 hours (I have 
>>>>>>>> Temperature Monitor running, which shows me all disks by 
>>>>>>>> serial number). 
>>>>>>>>
>>>>>>>> So I want to change it. But zpool detach <poolname> 
>>>>>>>> dev/disk5s2 gives the error "no such device in pool". 
>>>>>>>>
>>>>>>>> How can I abort the resilvering process? Or is there another 
>>>>>>>> way to restart the resilvering with a new disk? 
>>>>>>>>
>>>>>>>> The original disk with the bad block is already on its way 
>>>>>>>> to Western Digital (it was still in warranty).
>>
