I'm sure it's very hard to write good error-handling code for hardware events like this.
I think, after skimming this thread (a pretty wild ride), we can at least
decide that there is an RFE for a recovery tool for ZFS - something to allow
us to try to pull data from a failed pool. That seems like a reasonable tool
to request/work on, no?

On Thu, Feb 12, 2009 at 6:03 PM, Toby Thain <t...@telegraphics.com.au> wrote:
>
> On 12-Feb-09, at 3:02 PM, Tim wrote:
>
> > On Thu, Feb 12, 2009 at 11:31 AM, David Dyer-Bennet <d...@dd-b.net> wrote:
> >>
> >> On Thu, February 12, 2009 10:10, Ross wrote:
> >>
> >> > Of course, that does assume that devices are being truthful when they
> >> > say that data has been committed, but a little data loss from badly
> >> > designed hardware is, I feel, acceptable, so long as ZFS can have a go
> >> > at recovering corrupted pools when it does happen, instead of giving
> >> > up completely like it does now.
> >>
> >> Well; not "acceptable" as such. But I'd agree it's outside ZFS's purview.
> >> The blame for data lost due to hardware actively lying and not working to
> >> spec goes to the hardware vendor, not to ZFS.
> >>
> >> If ZFS could easily and reliably warn about such hardware I'd want it to,
> >> but the consensus seems to be that we don't have a reliable qualification
> >> procedure. In terms of upselling people to a Sun storage solution, having
> >> ZFS diagnose problems with their cheap hardware early is clearly
> >> desirable :-).
> >
> > Right, well I can't imagine it's impossible to write a small app that can
> > test whether or not drives are honoring commits correctly by issuing a
> > commit and immediately reading back to see if it was indeed committed or
> > not.
>
> You do realise that this is not as easy as it looks? :) For one thing, the
> drive will simply serve the read from cache.
> It's hard to imagine a test that doesn't involve literally pulling plugs;
> even better, a purpose-built hardware test harness.
> Nonetheless I hope that someone comes up with a brilliant test. But if the
> ZFS team hasn't found one yet... it looks grim :)
>
> --Toby
>
> > Like a "zfs test cXtX". Of course, then you can't just blame the hardware
> > every time something in zfs breaks ;)
> >
> > --Tim
>
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
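Toby's point stands: a read-back test proves nothing, since the drive happily serves the read from its own cache, and only a power-pull test is conclusive. There is, however, a well-known timing heuristic that can at least raise suspicion without a test harness: a spinning disk physically cannot commit a synchronous write much faster than its platter rotates, so sustained fsync latencies far below that floor suggest the drive is acknowledging writes from volatile cache. The sketch below is not from this thread and all names in it are hypothetical; the RPM, thresholds, and iteration counts are illustrative assumptions, and the verdict is a hint, not proof (SSDs and battery-backed caches legitimately beat the rotational floor).

```python
import os
import tempfile
import time


def rotation_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000.0 / rpm


def verdict(avg_fsync_ms, rpm=7200):
    """Classify an average fsync latency against the physical floor.

    A committed write to a spinning disk costs, on average, roughly half
    a rotation of latency (~4.2 ms at 7200 RPM).  Sustained fsyncs far
    below that imply the drive is acknowledging from write cache.
    Heuristic only: it cannot prove a drive honest, merely flag liars.
    """
    floor = rotation_ms(rpm) / 2
    return "suspect" if avg_fsync_ms < floor / 4 else "plausible"


def measure_fsync_ms(path, iterations=50):
    """Average the cost of a 512-byte write + fsync to `path`."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.pwrite(fd, b"\0" * 512, 0)
            os.fsync(fd)  # ask the OS (and drive) to commit to stable storage
        return (time.perf_counter() - start) * 1000.0 / iterations
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tf:
        path = tf.name
    try:
        avg = measure_fsync_ms(path)
        print(f"avg fsync: {avg:.3f} ms -> {verdict(avg)}")
    finally:
        os.unlink(path)
```

Note that this measures the whole OS-plus-drive path, so a filesystem that itself ignores sync requests will also trip it; a "suspect" result tells you commits are not reaching stable storage, not which layer is lying.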