On Fri, Dec 12, 2008 at 05:31:37PM -0500, Miles Nordin wrote:
> nw> If you can fully trust the SAN then there's no reason not to
> nw> run ZFS on top of it with no ZFS mirrors and no RAID-Z.
>
> The best practice I understood is currently to use zpool-layer
> redundancy especially with SAN even moreso than with single-spindle
Yes, but I believe this whole thread is about ZFS with no zpool-layer
redundancy, with RAID done in the SAN.

> local storage, because of (1) the new corruption problems people are

Your thesis is that all corruption problems observed with ZFS on SANs
are:

 a) phantom writes that never reached the rotating rust,
 b) not bit rot, corruption in the I/O paths, ...

Correct?

> The problems do not sound like random bit-flips. They're corruption
> of every ueberblock. The best-guess explanation AIUI, is not FC

Some of the earlier problems of type (2) were triggered by checksum
verification failures on pools with no redundancy, where ZFS would just
panic (IIRC). These were due to bit-rot issues, not cache flush
failures.

> checksum gremlins---it's that write access to the SAN is lost and then
> comes back---ex. if the SAN target loses power or fabric access but
> the ZFS host doesn't reboot---and either the storage stack is
> misreporting the failure or ZFS isn't correctly responding to the
> errors. see the posts I referenced.

It's possible that ZFS could, periodically (in the background) and/or at
pool import time (synchronously), validate the consistency on disk of
every transaction going backwards from the last until one is found that
is consistent, or until it runs out of past überblocks, or it goes too
far into the past. (Does ZFS have an option to do that? It might be a
useful option to have for dealing with lying SANs.)

> jh> ZFS' notorious instability during error conditions.
> right, availability is a reason to use RAID below ZFS layer. It might

ZFS handles device errors better when ZFS does redundancy at the zpool
layer, as opposed to when redundancy is left to the SAN. That's well
established, so why do you say the opposite?

Nico
--
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
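
To make the "walk backwards through past überblocks" idea from the
message above a bit more concrete, here is a rough sketch of the search
loop in Python. It is a toy model, not ZFS code: the Uberblock class,
the is_consistent callback, and the max_txgs_back limit are all
hypothetical names invented for illustration, and a real check would
have to traverse and checksum the on-disk block tree.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Uberblock:
    """Toy stand-in for an überblock (hypothetical, not the ZFS struct)."""
    txg: int   # transaction group number; higher means newer
    root: str  # made-up handle for the root of the block tree


def find_last_consistent(
    uberblocks: Iterable[Uberblock],
    is_consistent: Callable[[Uberblock], bool],
    max_txgs_back: int,
) -> Optional[Uberblock]:
    """Return the newest überblock that verifies, or None.

    The search stops when a consistent überblock is found, when the
    candidates run out, or when the next candidate is more than
    max_txgs_back transaction groups older than the newest one.
    """
    candidates = sorted(uberblocks, key=lambda ub: ub.txg, reverse=True)
    if not candidates:
        return None
    newest_txg = candidates[0].txg
    for ub in candidates:
        if newest_txg - ub.txg > max_txgs_back:
            break  # too far into the past; give up
        if is_consistent(ub):
            return ub
    return None


if __name__ == "__main__":
    ring = [Uberblock(txg, f"root-{txg}") for txg in range(100, 106)]
    # Pretend the SAN acknowledged these transaction groups but never
    # persisted them (made-up data for the example).
    lost = {104, 105}
    print(find_last_consistent(ring, lambda ub: ub.txg not in lost, 4))
```

The is_consistent callback is deliberately left abstract: how expensive
and how thorough that validation is would decide whether the check can
run synchronously at import time or only in the background.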