On Thu, 22 Dec 2011, Gareth de Vaux wrote:
> On Thu 2011-12-22 (10:09), Bob Friesenhahn wrote:
>> One of your disks failed to return a sector. Due to redundancy, the
>> original data was recreated from the remaining disks. This is normal
>> good behavior (other than the disk failing to read the sector).
>
> So those checksum counts were historical?
Yes. When a problem is detected and there is enough redundancy to
resolve the problem, then the bad data block is not used any more and
the corrected data is relocated somewhere else on the drive (I am not
sure if zfs does this, or if it requests that the drive firmware do this).
The count reflects that problems were found but does not reflect
whether a correction was made. Other text describes any continuing
issues which remain.
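For reference, the per-device error counters can be inspected with
something like the following ('tank' is just an example pool name;
substitute your own):

```shell
# Show per-device read/write/checksum error counters, and list any
# files affected by unrecoverable errors ('tank' is an example
# pool name).
zpool status -v tank
```

The CKSUM column is the counter being discussed here: it accumulates
until an administrator clears it, so a non-zero value may describe a
problem that has already been corrected.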
> I did a scrub and what worries me is that it came back with 0 issues
> when clearly there were some, considering what happens when I kick
> out 1 disk.
Zero issues seems like a good thing. Resilvering the disk in the pool
performed most of the functions that scrub does, so it should not be
surprising that no issues remain.
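If you want to re-verify after a resilver, a scrub can be started and
monitored along these lines (again, 'tank' is only an example pool
name):

```shell
# Start a scrub, which reads and verifies every allocated block
# against its checksum ('tank' is an example pool name).
zpool scrub tank

# Check on progress; the status output reports how much data was
# repaired and whether any errors were found.
zpool status tank
```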
> Similarly I've seen that 'zpool clear' just sets you up for problems
> down the line. It just pretends there aren't errors.
As far as I am aware, the data cleared by 'zpool clear' is there for
administrators to confirm that they are aware the issue occurred. A
good administrator will consider any implications. The decision made
for a high-capacity SATA drive should likely be different than that
made for a low-capacity enterprise SAS drive. Studies by Google
suggest that SATA drives will experience many more block errors than
enterprise SAS drives, and that higher error rates should be tolerated
from SATA drives than from SAS drives when deciding whether to replace
a disk. Drives which experience a continually growing number of block
failures are doomed to fail.
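In other words, clearing is just an acknowledgement, not a repair.
Something like this resets the counters once the errors have been
noted ('tank' is an example pool name):

```shell
# Reset the read/write/checksum error counters after the errors have
# been investigated ('tank' is an example pool name). This only
# clears the counts; it does not fix or hide any real damage, and
# new errors will increment the counters again.
zpool clear tank
```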
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
zfs-discuss mailing list