2012-05-12 5:50, Robert Milkowski wrote:
What conditions can cause the reset of the resilvering process? My
lost-and-found disk can't get back into the pool because of resilvers

Well, for the night I rebooted the machine into single-user mode, to rule out
zones, crontabs and networked abusers, but I still get resilvering resets every
now and then, about once an hour.

I'm now trying a run with all zfs datasets unmounted, hope that helps
somewhat... I'm growing puzzled now.

To double check that no snapshots, etc. are being created, run:
  zpool history -il pond

Hmmm, I should have thought of that - but didn't.
Anyhow, the pool history reports many "internal scrub"
entries starting and ending, with times closely matching
the resilver restarts I am seeing. Since the reboot into
single-user mode there have been no snapshots, but
while the system was fully up there were occasional
"internal snapshots" as well.

I'll post the last several lines, in hopes that someone
would make more sense of them.

Overall, the practical question is whether the disk will
make it back into the live pool (ultimately with no
continuous resilvering), and how fast that can be done -
I don't want to risk the big pool with nonredundant
arrays for too long.

It has already taken 2 days to try to resilver a 250 GB
disk into the pool, but it never made it past 100 GB of progress. :(
No errors are reported that I can see, either... :)

2012-05-12.00:50:36 [internal pool scrub done txg:91070404] complete=0 [user root on thumper]
2012-05-12.00:50:38 [internal pool scrub txg:91070404] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
2012-05-12.01:19:00 [internal pool scrub done txg:91070477] complete=0 [user root on thumper]
2012-05-12.01:19:02 [internal pool scrub txg:91070477] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
2012-05-12.01:46:28 [internal pool scrub done txg:91070571] complete=0 [user root on thumper]
2012-05-12.01:46:30 [internal pool scrub txg:91070571] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
2012-05-12.02:45:52 [internal snapshot txg:91071279] dataset = 58650 [user daemon on thumper]
2012-05-12.02:45:56 [internal snapshot txg:91071280] dataset = 58652 [user jim on thumper]
2012-05-12.02:46:15 [internal snapshot txg:91071283] dataset = 58654 [user daemon on thumper]
2012-05-12.02:53:01 [internal pool scrub done txg:91071298] complete=0 [user root on thumper]
2012-05-12.02:53:03 [internal pool scrub txg:91071298] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
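
To see at a glance which kinds of internal events show up and which user issued them, the listing can be tallied with a few lines of Python. This is only a rough sketch: the sample text is the tail of the listing above, and the regular expression assumes every line has exactly the layout shown there (the names SAMPLE, EVENT_RE and tally are made up for this illustration).

```python
import re
from collections import Counter

# Tail of the pre-reboot listing, copied from the history output above.
SAMPLE = """\
2012-05-12.02:45:52 [internal snapshot txg:91071279] dataset = 58650 [user daemon on thumper]
2012-05-12.02:45:56 [internal snapshot txg:91071280] dataset = 58652 [user jim on thumper]
2012-05-12.02:46:15 [internal snapshot txg:91071283] dataset = 58654 [user daemon on thumper]
2012-05-12.02:53:01 [internal pool scrub done txg:91071298] complete=0 [user root on thumper]
2012-05-12.02:53:03 [internal pool scrub txg:91071298] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
"""

# Assumed layout: "[internal <event> txg:N] ... [user <name> on <host>]"
EVENT_RE = re.compile(
    r"\[internal (?P<event>.+?) txg:\d+\].*\[user (?P<user>\S+) on (?P<host>\S+)\]$"
)

def tally(text):
    """Count (event, user) pairs among the internal history entries."""
    counts = Counter()
    for line in text.splitlines():
        m = EVENT_RE.search(line)
        if m:
            counts[(m.group("event"), m.group("user"))] += 1
    return counts

counts = tally(SAMPLE)
for (event, user), n in sorted(counts.items()):
    print(f"{n:3d}  {event:<16}  {user}")
```

On these five lines it counts three snapshot entries (two by daemon, one by jim) and one scrub start and one scrub end by root.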

System rebooted into single-user around 3:03am; subsequent entries
seem to be only about resilvers of different lengths - if that is
what "internal scrubs" are:

2012-05-12.03:06:09 [internal pool scrub done txg:91071322] complete=0 [user root on thumper]
2012-05-12.03:06:11 [internal pool scrub txg:91071322] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
2012-05-12.04:09:35 [internal pool scrub done txg:91071448] complete=0 [user root on thumper]
2012-05-12.04:09:37 [internal pool scrub txg:91071448] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
2012-05-12.05:42:43 [internal pool scrub done txg:91071631] complete=0 [user root on thumper]
2012-05-12.05:42:45 [internal pool scrub txg:91071631] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
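
To put numbers on "resilvers of different lengths", here is a small Python sketch that pairs each "internal pool scrub" start entry with the following "internal pool scrub done" entry and prints how long each run lasted. It rests on two assumptions of mine: that these entries really do correspond to the resilver attempts, and that complete=0 means the run ended without finishing. The names HISTORY, LINE_RE and scan_runs are invented for the example.

```python
import re
from datetime import datetime

# Post-reboot entries, copied from the history output above.
HISTORY = """\
2012-05-12.03:06:09 [internal pool scrub done txg:91071322] complete=0 [user root on thumper]
2012-05-12.03:06:11 [internal pool scrub txg:91071322] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
2012-05-12.04:09:35 [internal pool scrub done txg:91071448] complete=0 [user root on thumper]
2012-05-12.04:09:37 [internal pool scrub txg:91071448] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
2012-05-12.05:42:43 [internal pool scrub done txg:91071631] complete=0 [user root on thumper]
2012-05-12.05:42:45 [internal pool scrub txg:91071631] func=1 mintxg=41 maxtxg=91051854 [user root on thumper]
"""

LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}\.\d{2}:\d{2}:\d{2}) "
    r"\[internal pool scrub (?P<done>done )?txg:(?P<txg>\d+)\] "
    r"(?P<args>.*?) \[user"
)

def scan_runs(text):
    """Return (start, end, complete) for every start entry followed by a done entry."""
    runs = []
    start = None
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # not a pool scrub entry (e.g. a snapshot)
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d.%H:%M:%S")
        if m.group("done"):
            if start is not None:
                # Assumption: complete=1 would mark a run that finished.
                runs.append((start, ts, "complete=1" in m.group("args")))
                start = None
        else:
            start = ts
    return runs

runs = scan_runs(HISTORY)
for start, end, complete in runs:
    minutes = (end - start).total_seconds() / 60
    print(f"{start:%H:%M:%S} -> {end:%H:%M:%S}  {minutes:6.1f} min  complete={complete}")
```

The first "done" entry has no start in this excerpt and the last start has no "done" yet, so both are skipped. That leaves two runs, of about 63 and 93 minutes, neither marked complete.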
