On Thu, Feb 22, 2007 at 10:45:04AM -0800, Olaf Manczak wrote:
> 
> Obviously, scrubbing and correcting "hard" errors that result in
> ZFS checksum errors is very beneficial. However, it won't address the
> case of "soft" errors when the disk returns correct data but
> observes some problems reading it. There are at least two good reasons
> to pay attention to these "soft" errors:
> 
> a) Preemptive detection and rewriting of partially defective but
>    still correctable sectors may prevent future data loss. Thus,
>    it improves the perceived reliability of disk drives, which is
>    especially important in the JBOD case (including a single-drive JBOD).

These types of soft errors will be logged, managed, and (eventually)
diagnosed by SCSI FMA work currently in development.  If the SCSI DE
diagnoses a disk as faulty, then the ZFS agent will be able to respond
appropriately.

> b) It is not uncommon for such successful reads of partially defective
>    media to happen only after several retries. It is somewhat unfortunate
>    that there is no simple way to tell the drive how many times to retry.
>    Firmware in ATA/SATA drives, used predominantly in single-disk PCs,
>    will typically make a heroic effort to retrieve the data. It will
>    make numerous attempts to reposition the actuator, recalibrate the
>    head current, etc. It can take up to 20-40 seconds! Such a strategy
>    is reasonable for a desktop PC, but if it happens on a busy
>    enterprise file server it results in a temporary availability loss
>    (the drive freezes for up to 20-40 seconds every time you try to
>    read this sector). Also, this strategy does not make any sense if
>    a RAID group in which the drive participates has redundant data
>    elsewhere, which is why SCSI/FC drives give up after a few retries.
> 
> One can detect (and repair) such problematic areas on disk by monitoring
> the SMART counters during scrubbing, and/or by monitoring physical
> read timings (looking for abnormally slow ones).
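
For what it's worth, here is a rough sketch of the read-timing idea
(illustrative only; this is not code from ZFS or the FMA work, and the
device path, block size, and latency threshold below are arbitrary
placeholders).  It walks a device, times each read, and rewrites any
region that comes back suspiciously slowly, so the firmware gets a
chance to remap the marginal sectors underneath:

/*
 * Illustrative sketch: time reads across a disk and rewrite regions
 * whose reads succeed but take abnormally long.  The device path,
 * block size, and threshold are placeholders, not recommendations.
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <time.h>

#define BLKSZ   (64 * 1024)     /* read granularity */
#define SLOW_MS 200.0           /* "abnormally slow" threshold */

static double
elapsed_ms(struct timespec a, struct timespec b)
{
    return ((b.tv_sec - a.tv_sec) * 1000.0 +
        (b.tv_nsec - a.tv_nsec) / 1e6);
}

int
main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/rdsk/c0t0d0s2";
    char buf[BLKSZ];
    off_t off = 0;
    ssize_t n;
    int fd = open(dev, O_RDWR);

    if (fd < 0) {
        perror(dev);
        return (1);
    }

    for (;;) {
        struct timespec t0, t1;

        (void) clock_gettime(CLOCK_MONOTONIC, &t0);
        n = pread(fd, buf, sizeof (buf), off);
        (void) clock_gettime(CLOCK_MONOTONIC, &t1);

        if (n <= 0)
            break;              /* EOF or a hard read error */

        if (elapsed_ms(t0, t1) > SLOW_MS) {
            /*
             * The data came back, but slowly: rewrite the same
             * bytes in place so the drive can reallocate the
             * weak sectors.
             */
            (void) printf("slow read at offset %lld, rewriting\n",
                (long long)off);
            (void) pwrite(fd, buf, (size_t)n, off);
        }
        off += n;
    }
    (void) close(fd);
    return (0);
}

In practice something like this belongs alongside a scrubber that
already knows which blocks are allocated, rather than a blind sweep of
the whole device.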

Solaris currently has a disk monitoring FMA module that is specific to
Thumper (x4500) and monitors only the most basic information (overtemp,
self-test fail, predictive failure).  I have separated this out into a
common FMA transport module which will bring this functionality to all
platforms (though support for SCSI devices will depend on the
aforementioned SCSI FMA portfolio).  This should be putback soon.
Future work could expand this beyond the simple indicators into more
detailed analysis of various counters.
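
To make the counter idea a bit more concrete, here is a rough sketch of
the kind of polling such a module might do (again illustrative only,
and not the transport module described above).  It shells out to
smartctl, so it assumes smartmontools is installed; the attribute names
shown are the usual ATA media-health counters, and SCSI devices report
this information differently:

/*
 * Illustrative sketch: watch two of the usual ATA SMART media-health
 * attributes by invoking smartctl.  The device path is a placeholder.
 */
#include <stdio.h>
#include <string.h>

int
main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/rdsk/c0t0d0s2";
    char cmd[256], line[512];
    FILE *fp;

    (void) snprintf(cmd, sizeof (cmd), "smartctl -A %s", dev);
    if ((fp = popen(cmd, "r")) == NULL) {
        perror("popen");
        return (1);
    }

    while (fgets(line, sizeof (line), fp) != NULL) {
        /*
         * Growth in either counter suggests the media is degrading
         * and the disk deserves closer attention (or a scrub).
         */
        if (strstr(line, "Reallocated_Sector_Ct") != NULL ||
            strstr(line, "Current_Pending_Sector") != NULL)
            (void) fputs(line, stdout);
    }
    (void) pclose(fp);
    return (0);
}

A real implementation would read the log pages directly rather than
parse command output, and would feed deltas into a diagnosis engine
instead of printing them.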

All of this is really a common FMA problem, not ZFS-specific.  All that
is needed in ZFS is an agent actively responding to external diagnoses.
I am laying the groundwork for this as part of the ongoing ZFS/FMA work
mentioned in other threads.  For more information on that work, I
recommend visiting the FMA discussion forum.

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
