Re: [zfs-discuss] How does resilver/scrub work?

Jim Klimov Wed, 23 May 2012 15:00:18 -0700

Thanks again,

2012-05-24 1:01, Richard Elling wrote:

At least the textual error message implies that if a hot spare
were available for the pool, it would kick in and invalidate
the device I am scrubbing to update into the pool after the
DD-phase (well, it was not DD but a hung-up resilver in this
case, but that is not essential).


The man page is clear on this topic, IMHO


Indeed, even in snv_117 the zpool man page says that. But the
console/dmesg message was also quite clear, so go figure whom
to trust (or fear) more ;)

fmd: [ID 377184 daemon.error] SUNW-MSG-ID: ZFS-8000-GH, TYPE: Fault, VER: 1, SEVERITY: Major

EVENT-TIME: Wed May 16 03:27:31 MSK 2012
PLATFORM: Sun Fire X4500, CSN: 0804AMT023            , HOSTNAME: thumper
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: cc25a316-4018-4f13-c675-d1d84c6325c3
DESC: The number of checksum errors associated with a ZFS device
exceeded acceptable levels. Refer to http://sun.com/msg/ZFS-8000-GH for more information.

AUTO-RESPONSE: The device has been marked as degraded.  An attempt
will be made to activate a hot spare if available.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

> dd, or similar dumb block copiers, should work fine.
> However, they are inefficient...

Define efficient? In terms of transferring the 900GB payload
of a 1TB HDD used for ZFS for a year - DD would beat resilver
anytime, in terms of getting most or (less likely) all of the
valid bits with data onto the new device. It is the next phase
(getting the rest of the bits into valid state) that needs
some attention, manual or automated.
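
To put rough numbers on that comparison, here is a back-of-the-envelope
sketch in Python. It assumes the DD pass is limited by sequential
throughput over the whole disk, while a resilver is limited by random
IOPS over the allocated blocks. That model is a simplification, and the
function names and whatever figures you feed it (throughput, IOPS,
average block size) are hypothetical inputs, not measurements from the
thumper:

```python
def sequential_copy_seconds(disk_bytes, seq_bytes_per_sec):
    """Time for a dumb block copier to stream the whole disk once."""
    return disk_bytes / seq_bytes_per_sec


def random_io_seconds(payload_bytes, avg_block_bytes, iops):
    """Time to touch every allocated block at one random IO per block.

    Assumption: each block costs exactly one IO and the disk sustains
    a constant number of IOs per second.
    """
    blocks = -(-payload_bytes // avg_block_bytes)  # ceiling division
    return blocks / iops


def hours(seconds):
    """Convert seconds to hours for easier reading."""
    return seconds / 3600.0
```

Feeding it, say, a 1TB disk at 100 MB/s for DD against 900GB of 64KB
blocks at 100 IOPS for resilver gives a feel for the gap, but a real
pool will differ with fragmentation and concurrent load.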


speed != efficiency


Ummm... this is likely to start a flame war with other posters,
and you did not say what efficiency means to you. How can we compare
apples to meat, not even knowing whether the latter is a steak or
a pork knee?

I, for now, choose to stand by a statement that reduction of the
timeframe that the old disk needs to be in the system is a good
thing, as well as that changing the IO pattern from random writes
into (mostly) sequential writes and after that random reads may
also be somewhat more efficient, especially under other loads
(interfering less with them). Even though the whole replacement
process may take more wallclock time, there are cases when I'd
likely trust it to do a better job than original resilvering.

I think someone with equipment could stage an experiment and
compare the two procedures (existing and proposed) on a nearly
full and somewhat fragmented pool.

Maybe you can disenchant me (not with vague phrases but either
theory or practice) and I would then see that my trust is blind,
misdirected and without basis. =)

IMHO, this is too operationally complex for most folks. KISS wins.


That's why I proposed to tuck this scenario under the zfs hood
(DD + selective scrub + ditto writes during the process,
as an optional alternative to the current resilver), or else to hear
a coherent explanation of why this should never be done, in any situation.
Implementing it as a standard supported command would be KISS ;)
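
A toy model of the proposed sequence may help show what is meant. This
is a sketch in Python under loud assumptions: blocks are list entries,
CRC32 stands in for the ZFS block checksum, an unreadable source block
is copied as zeros (similar in spirit to dd with conv=noerror,sync), and
the ditto writes are left out. None of the names below are ZFS
interfaces:

```python
import zlib


def checksum(data):
    """Stand-in checksum (CRC32), NOT what ZFS actually uses."""
    return zlib.crc32(data)


def dd_copy(source, block_size):
    """Blind sequential copy; unreadable blocks (None) become zeros."""
    zero = b"\0" * block_size
    return [blk if blk is not None else zero for blk in source]


def selective_scrub(target, expected_sums, rebuild):
    """Verify each copied block and repair only the mismatching ones.

    `rebuild` is a hypothetical callable that reconstructs block i
    from the pool's remaining redundancy (mirror side or raidz parity).
    Returns the list of block indexes that had to be repaired.
    """
    repaired = []
    for i, blk in enumerate(target):
        if checksum(blk) != expected_sums[i]:
            target[i] = rebuild(i)
            repaired.append(i)
    return repaired
```

The point of the split is that the first pass is purely sequential and
the second pass only does random IO for the blocks that actually failed
verification.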

Especially if it is known that with some quirks this procedure
works, and may be beneficial in some cases, e.g. by reducing
the timeframe that a pool with a flaky disk in place is exposed
to potential loss of redundancy and large amounts of data, and
in the worst case the loss is constrained to those sectors
which couldn't be (correctly) read by DD from the source disk
and couldn't be reconstructed by raidz/mirror redundancies due
to whatever overlapping problems (e.g. a sector from the same block
died on another disk too).
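
The worst case described above can be written down as a small set
computation. Again this is a sketch in Python with hypothetical names:
block numbers are logical, `redundancy` is the number of damaged copies
per block that the vdev can tolerate (1 for a two-way mirror or raidz1,
2 for raidz2), and a block counts as lost only when more devices hold a
bad piece of it than the redundancy can cover:

```python
def lost_blocks(bad_on_source, bad_on_others, redundancy):
    """Return the blocks that neither DD nor reconstruction can save.

    bad_on_source: set of block numbers DD could not read correctly.
    bad_on_others: dict mapping device name -> set of bad block numbers.
    redundancy:    how many damaged devices per block are survivable.
    """
    lost = set()
    for block in bad_on_source:
        # Count the source disk itself plus every other damaged device.
        damaged = 1 + sum(1 for bad in bad_on_others.values() if block in bad)
        if damaged > redundancy:
            lost.add(block)
    return lost
```

Everything DD failed to read but the other disks still hold intact
falls outside this set and would be repaired by the scrub phase.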

What is it about error counters that frightens you enough to want to clear
them often?


In this case, mostly, the fright of having the device kicked
out of the pool automatically instead of getting it "synced"
("resilvered" is an improper term here, I guess) to a proper state.

In general - since this is a part of some migration procedure
which is, again, expected to have errors, we don't really care
for signalling them. Why doesn't the original resilver signal
several million CKSUM errors per new empty disk when it does
reconstruction of sectors onto it? I'd say this is functionally
identical. (At least, it would be, if it were part of a supported
procedure as I suggest).

Thanks,
//Jim Klimov

PS: I pondered for a while whether I should make up an argument that
with a dying disk's mechanics, lots of random IO (resilver) instead
of sequential IO (DD) would cause it to die faster, but that's
just FUD not backed by any scientific data or statistics -
which you likely have, perhaps even contradicting this argument.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
