I'm sure it's very hard to write good error-handling code for hardware
events like this.

I think, after skimming this thread (a pretty wild ride), we can at
least agree that there's an RFE here for a recovery tool for ZFS -
something to let us try to pull data out of a pool that will no longer
import.  That seems like a reasonable tool to request and work on, no?
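
Even a read-only salvage mode would be a start.  To make the request
concrete (the command name below is purely hypothetical, just to
sketch an interface): something like "zpool salvage tank /rescue" that
walks whatever uberblocks and block pointers still pass their
checksums and copies files out best-effort, or an import option that
deliberately rolls the pool back to the newest consistent uberblock
instead of refusing the import outright.  Recovering 95% of a pool
beats recovering 0% of it.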


On Thu, Feb 12, 2009 at 6:03 PM, Toby Thain <t...@telegraphics.com.au> wrote:
>
> On 12-Feb-09, at 3:02 PM, Tim wrote:
>
>
> On Thu, Feb 12, 2009 at 11:31 AM, David Dyer-Bennet <d...@dd-b.net> wrote:
>>
>> On Thu, February 12, 2009 10:10, Ross wrote:
>>
>> > Of course, that does assume that devices are being truthful when
>> > they say that data has been committed, but a little data loss from
>> > badly designed hardware is, I feel, acceptable - so long as ZFS can
>> > have a go at recovering corrupted pools when it does happen,
>> > instead of giving up completely like it does now.
>>
>> Well, not "acceptable" as such.  But I'd agree it's outside ZFS's purview.
>>  The blame for data lost due to hardware actively lying and not working to
>> spec goes to the hardware vendor, not to ZFS.
>>
>> If ZFS could easily and reliably warn about such hardware I'd want it to,
>> but the consensus seems to be that we don't have a reliable qualification
>> procedure.  In terms of upselling people to a Sun storage solution, having
>> ZFS diagnose problems with their cheap hardware early is clearly desirable
>> :-).
>>
>
>
> Right, well I can't imagine it's impossible to write a small app that
> tests whether drives are honoring cache flushes: issue a commit, then
> immediately read back to see whether the data was in fact committed.
>
> You do realise that this is not as easy as it looks? :) For one thing, the
> drive will simply serve the read from cache.
> It's hard to imagine a test that doesn't involve literally pulling
> plugs; even better, a purpose-built hardware test harness.
> Nonetheless I hope that someone comes up with a brilliant test. But if the
> ZFS team hasn't found one yet... it looks grim :)
> --Toby
>
> Like a "zfs test cXtX".  Of course, then you can't just blame the
> hardware every time something in ZFS breaks ;)
>
> --Tim
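
To put Tim's idea and Toby's objection in concrete terms, here's a
rough sketch of the kind of harness this takes (my own illustration,
not an existing tool - and it only proves anything if you actually cut
the power mid-run).  Write numbered records with O_DSYNC and announce
each number only after the synchronous write returns, ideally piping
the announcements to a second machine, since a log kept on the test
box dies with the power:

/* flushtest.c - write numbered records with O_DSYNC and announce each
 * one only after the drive has claimed it durable.  Run it against a
 * raw device (not a file in a filesystem) so nothing else caches in
 * between, pipe stdout to a second machine, then pull the plug. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define RECSZ 512                    /* one record per 512-byte sector */

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <raw-device>\n", argv[0]);
        return 1;
    }

    /* O_DSYNC: write() must not return until the data is on stable
     * storage - or until the drive *claims* it is, which is exactly
     * what we're testing. */
    int fd = open(argv[1], O_WRONLY | O_DSYNC);
    if (fd == -1) {
        perror("open");
        return 1;
    }

    char buf[RECSZ];
    for (unsigned long seq = 0; ; seq++) {
        memset(buf, 0, sizeof buf);
        snprintf(buf, sizeof buf, "%lu", seq);

        if (pwrite(fd, buf, sizeof buf, (off_t)(seq * RECSZ)) != RECSZ) {
            perror("pwrite");
            break;
        }

        /* Announce ONLY after the synchronous write has returned. */
        printf("committed %lu\n", seq);
        fflush(stdout);
    }

    (void) close(fd);
    return 0;
}

After the power pull, reboot and read the records back: any number
that was announced but isn't on the platter means the drive
acknowledged a commit it never made durable.  Toby's point stands,
though - reading back on the still-running system proves nothing,
because the drive will happily serve the data from its own cache.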
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
