2012-01-15 19:38, Edward Ned Harvey wrote:
>> 1) How does raidzN protect against bit-rot without known full
>> death of a component disk, if it at all does?
> zfs can read disks 1,2,3,4... Then read disks 1,2,3,5...
> Then read disks 1,2,4,5... ZFS can figure out which disk
> returned the faulty data, UNLESS the disk actually returns
> correct data upon subsequent retries.

Makes sense, if ZFS does actually do that ;)

Counter-examples:

1) For several scrubs in a row, my pool consistently reported two vdev errors and one pool error with zero per-disk errors (further leading to an error in some object <metadata>:<0x0>). If the disk-read errors were transient, sometimes returning correct data (i.e. bad sector relocation was successful in the background), ZFS would receive good blocks on further scrubs - shouldn't it?

2) Even with one bad sector consistently in place, if ZFS can deduce the correct original block data, why report the error at all (and repeatedly at that) as uncorrectable?

This leaves me thinking of two on-disk errors, and/or a lack of checksums for leaf blocks, as the possible reasons for such detected raidz errors with undetected faulty individual disks. Any other options I overlooked?
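
To make the "read all disks but one, then retry with a different one left out" idea concrete, here is a minimal Python sketch. It is not ZFS code: it assumes a toy single-parity stripe (plain XOR, roughly raidz1-like) and a SHA-256 checksum of the data columns stored elsewhere, and all function names are made up for illustration. It shows that one consistently bad column can be pinpointed and repaired, while two bad columns in the same stripe leave single parity with nothing that verifies, which is one way to get an "uncorrectable" report without any disk itself returning an I/O error.

```python
import hashlib
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length columns."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(data_cols):
    """Toy single-parity stripe: the data columns plus one XOR parity column."""
    return list(data_cols) + [reduce(xor_bytes, data_cols)]


def checksum(data_cols):
    """Block checksum over the data columns (assumed to be stored elsewhere)."""
    return hashlib.sha256(b"".join(data_cols)).hexdigest()


def find_bad_disk(cols, expected):
    """Return (bad_index, good_data_cols); bad_index is None if all is clean.

    Raises ValueError when no single-column exclusion yields data that
    matches the expected checksum (e.g. two columns are damaged).
    """
    n_data = len(cols) - 1
    data = list(cols[:n_data])
    if checksum(data) == expected:
        # Data is fine; the parity column may still be the damaged one.
        parity_ok = reduce(xor_bytes, data) == cols[n_data]
        return (None if parity_ok else n_data), data
    for suspect in range(n_data):
        # Ignore the suspect column and rebuild it from all the others.
        others = [c for i, c in enumerate(cols) if i != suspect]
        rebuilt = reduce(xor_bytes, others)
        candidate = data[:suspect] + [rebuilt] + data[suspect + 1:]
        if checksum(candidate) == expected:
            return suspect, candidate
    raise ValueError("uncorrectable: no single-disk exclusion verifies")


good = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
expected = checksum(good)

one_bad = make_stripe(good)
one_bad[2] = b"CCCX"                      # one silently corrupted column
bad, repaired = find_bad_disk(one_bad, expected)
print(bad, repaired == good)

two_bad = make_stripe(good)
two_bad[1], two_bad[2] = b"BBBX", b"CCCX"  # two corrupted columns
try:
    find_bad_disk(two_bad, expected)
except ValueError as err:
    print(err)
```

In this toy model the culprit is identified purely by which exclusion makes the checksum verify, so a disk that returns wrong data the same way every time is still caught, provided only one column of the stripe is wrong.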

> You know the open-source question in regards to ZFS is pretty much
> concluded, right? What Oracle called zpool version 28 was the last
> open source version, currently in use on Nexenta, FreeBSD, and some
> others. The illumos project has continued development, minimally.
> If you think the development effort is resource limited in Oracle
> working on ZFS, just try the open source illumos community...

I do try it. I also see some companies, like Nexenta or Joyent, having discussed the NetApp problem and moved on, betting on their work with open-sourced ZFS.

Also, Oracle's closed ZFS is actually of little relevance to me or other SOHO users (laptops, home NASes, etc.). As Oracle doesn't deal with small customers, and people still have problems buying or getting support for small-volume stuff, or find Oracle's offerings prohibitively expensive, it is hard to get Oracle to notice a bug/RFE report not backed by money. There is nothing inherently bad about the business model; Sun also had it (while being more open to suggestions). It's just that in this model SOHO users have no influence on ZFS, and it becomes a closed proprietary gadget like any other FS, with no engineering interest in enhancing it. And this is coupled with a limited understanding of whether you have a right to use it at all without getting sued by Oracle (e.g. for trying to put Solaris 11 into production without paying the tax).

Over the past year I have proposed or discussed a number of features for ZFS, and while there is little chance that illumos developers would implement any of that soon, there is near-zero chance that Oracle ever will. And there is a greater chance that I or some other developer would dig into such RFEs and publish a solution - especially if such a developer is helped with theory.

//Jim
_______________________________________________
zfs-discuss mailing list
email@example.com
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss