On May 23, 2012, at 1:32 PM, Jim Klimov wrote:
> 2012-05-23 20:54, Richard Elling wrote:
>> comments far below...
> Thank you Richard for taking notice of this thread and the
> definitive answers I needed not quote below, for further
> questions ;)
>>> 2) How did you "treat errors as expected" during scrub?
>>> As I've discovered, there were hoops to jump through.
>>> Is there a switch to disable "degrading" of pools and
>>> TLVDEVs based on only the CKSUM counts?
>> DEGRADED is the status. You clear degraded states by fixing the problem
>> and running zpool clear. DEGRADED, in and of itself, is not a problem.
> Doesn't this status preclude the device with many CKSUM errors
> from participating in the pool (TLVDEV) and the remainder of
> the scrub in particular?
> At least the textual error message implies that if a hotspare
> were available for the pool, it would kick in and invalidate
> the device I am scrubbing to update into the pool after the
> DD-phase (well, it was not DD but a hung-up resilver in this
> case, but that is not substantial).
The man page is clear on this topic, IMHO:

    One or more top-level vdevs is in the degraded state because
    one or more component devices are offline. Sufficient replicas
    exist to continue functioning.

    One or more component devices is in the degraded or faulted
    state, but sufficient replicas exist to continue functioning.
    The underlying conditions are as follows:

      o  The number of checksum errors exceeds acceptable levels
         and the device is degraded as an indication that something
         may be wrong. ZFS continues to use the device as necessary.
> Such automatic replacement is definitely not what I needed
> in this particular case, so if it were to happen - it would
> be a problem indeed, in and of itself.
>> dd, or similar dumb block copiers, should work fine.
>> However, they are inefficient...
> Define efficient? In terms of transferring the 900 GB payload
> of a 1 TB HDD used for ZFS for a year - DD would beat resilver
> anytime, in terms of getting most or (less likely) all of the
> valid bits with data onto the new device. It is the next phase
> (getting the rest of the bits into valid state) that needs
> some attention, manual or automated.
speed != efficiency
> Again, DD is not a good usecase indeed for pools with little
> data on big disks, and while I see why these could be used
> (i.e. to never face fragmentation), I haven't seen them in
> practice around here.
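To make the distinction between speed and efficiency concrete, here is a
rough back-of-the-envelope sketch in Python. It assumes that a block copier
such as dd reads every block on the device, while a resilver copies only
allocated data. The function name and the 1 TB = 1000 GB rounding are
illustrative assumptions, and throughput differences (sequential versus
random I/O) are deliberately ignored.

```python
def copy_overhead(device_gb, allocated_gb):
    """Ratio of data moved by a whole-device copy to the data actually allocated.

    Assumption: the block copier moves the full device, the resilver moves
    only allocated blocks. This measures volume moved, not elapsed time.
    """
    if allocated_gb <= 0:
        raise ValueError("allocated_gb must be positive")
    return device_gb / allocated_gb


# The case from this thread: about 900 GB of data on a 1 TB disk.
nearly_full = copy_overhead(1000, 900)

# A hypothetical lightly used pool: 100 GB of data on the same disk.
lightly_used = copy_overhead(1000, 100)

print(f"nearly full:  {nearly_full:.2f}x the allocated data")
print(f"lightly used: {lightly_used:.2f}x the allocated data")
```

On a nearly full disk the block copy moves only about 1.11 times the
allocated data, so the waste is small; on the lightly used disk it moves
10 times as much. Which approach finishes sooner is a separate question
that this sketch does not try to answer.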
>> ... and operationally difficult to manage
> Actually, that's why I asked whether it makes sense to
> automate such a scenario as another legal variant of disk
> replacement, complete with fast data transfer and verification
> and simultaneous work of the new and old devices until the
> data migration is marked complete. In particular that would
> take care of accepting the scrub errors as an expected part
> of the disk replacement and not a fatal fault/degradation,
> and/or allowing new writes to propagate onto the new disk
> while the replacement is going on and minimize discrepancies
> right on the run.
> In visible effect this would be similar to current resilver
> during replacement of a live disk with a hotspare, but the
> process would follow a different scenario I suggested earlier
> in the thread.
IMHO, this is too operationally complex for most folks. KISS wins.
>>> My raw hoop-jumping script:
>> I would never allow such scripts in my site. It is important to track the
>> progress and state changes. This script resets those counters for no
>> good reason.
>> I post this comment in the hope that future searches will not encourage
>> people to try such things.
> Understood, point taken, I won't try to promote such a "solution",
> and I agree that certainly it is not a good general idea indeed.
> It should be noted however (or I want to be corrected, please,
> if I am wrong), that:
> 1) Errors are expected on this run since the DD'ed copy is expected
> to deviate from current pool state; if the "degradation" mark of
> new disk would force it to be kicked out of the pool just because
> there are many CKSUM errors - which we know should be there due
> to manual DD-phase - then the reason is good IMHO (in this one
> 2) The progress is tracked by logging the error counts into a text
> file. If the admin fired up the script (manually in his terminal
> or a vnc/screen session), he can also look into the log file or
> even tail it.
> 3) The individual CKSUM errors are summed up in fmstat output, and
> this script does not zero them out, so even system-side tracking
> is not disturbed here.
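For readers who want to track error counts without resetting them, the
following Python sketch shows one possible read-only approach: parse the
device table printed by `zpool status` and record the CKSUM column. This is
not the script discussed in this thread. The sample text, device names, and
counts are invented for illustration, and the function name is hypothetical.
Counts are kept as strings because large values may be printed in
abbreviated form.

```python
import re

# Hypothetical sample of the device table from `zpool status`.
# Device names and counts are invented for illustration only.
SAMPLE = """\
        NAME        STATE     READ WRITE CKSUM
        pool        DEGRADED     0     0     0
          raidz1-0  DEGRADED     0     0     0
            c1t0d0  ONLINE       0     0     0
            c1t1d0  DEGRADED     0     0    42  too many errors
"""

# name, state, then three counters that start with a digit; the header row
# is skipped automatically because READ/WRITE/CKSUM do not start with a digit.
ROW = re.compile(
    r"^\s*(\S+)\s+([A-Z]+)\s+(\d\S*)\s+(\d\S*)\s+(\d\S*)(?:\s+.*)?$"
)


def cksum_counts(status_text):
    """Return {device_name: cksum_count_as_string} without modifying the pool."""
    counts = {}
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match:
            counts[match.group(1)] = match.group(5)
    return counts


for name, cksum in cksum_counts(SAMPLE).items():
    print(f"{name}: CKSUM={cksum}")
```

In practice the text could come from something like
`subprocess.run(["zpool", "status", "pool"], capture_output=True, text=True).stdout`,
where `pool` stands in for the real pool name. The sketch only reads the
status output and never calls `zpool clear`, so the counters and the state
history stay intact.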
> Anyhow, if there is a device with just a few CKSUM errors, then the
> next scrub clears its error counts anyway (if no new problems are
What is it about error counters that frightens you enough to want to clear
zfs-discuss mailing list