A few things to think about when reading that forum post:

- The scenario described in that post is based on the assumption that all blocks read from disk somehow get funneled through a single memory location, which also happens to have a permanent fault.
- In addition, it assumes that after a checksum failure, the corrected data either gets stored in the exact same memory location again, or in another memory location that also has a permanent fault.
- It also completely ignores the fact that ZFS has an internal error threshold and will automatically offline a device once the number of read/checksum errors seen on it exceeds that threshold, preventing further corruption. ZFS will *not* go and happily mess up your entire pool.
- This would *not* be silent; ZFS would report a large number of checksum errors on all your devices.
- Blocks corrupted in that particular way would *not* actually spread to incremental backups or via rsync, as the corrupted blocks would not be seen as modified.
- There is no indication that the reported cases of data loss that he points to are actually due to the particular failure mechanism described in the post; there are *lots* of other ways in which memory corruption can lead to a file system becoming unmountable, checksums or not.
- Last but not least, note that "Cyberjock" is a community moderator, not somebody who's actually in any way involved in the development of ZFS (or even FreeNAS; see the preface of his FreeNAS guide for some info on his background). If this were really as big of a risk as he thinks it is, you'd think somebody who is actually familiar with the internals of ZFS would have raised this concern before.
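
The points about checksum failures and the error threshold can be illustrated with a toy Python model. This is only a sketch of the argument, not ZFS code: the `Device` class, the `read_block` function, and the `ERROR_THRESHOLD` value are all hypothetical, and SHA-256 merely stands in for a block checksum. It assumes a two-way mirror and a memory fault that flips one bit in every buffer read from disk.

```python
import hashlib

ERROR_THRESHOLD = 3  # hypothetical value, not the threshold ZFS actually uses


def checksum(data: bytes) -> bytes:
    """Stand-in for a block checksum."""
    return hashlib.sha256(data).digest()


def flip_bit(data: bytes) -> bytes:
    """Simulate a faulty memory cell: flip the lowest bit of the first byte.

    Assumes data is non-empty.
    """
    return bytes([data[0] ^ 0x01]) + data[1:]


class Device:
    """Toy mirror member holding one block (hypothetical, for illustration)."""

    def __init__(self, name: str, block: bytes):
        self.name = name
        self.block = block
        self.checksum_errors = 0
        self.online = True


def read_block(devices, expected, ram_is_faulty):
    """Return the first copy whose checksum verifies, or None if none does.

    ram_is_faulty(dev) decides whether the buffer read from dev lands in
    bad memory. Each failed verification is counted against the device,
    and a device is taken offline once it reaches ERROR_THRESHOLD.
    Nothing is ever written back unless a copy has verified.
    """
    for dev in devices:
        if not dev.online:
            continue
        buf = dev.block
        if ram_is_faulty(dev):
            buf = flip_bit(buf)
        if checksum(buf) == expected:
            return buf
        dev.checksum_errors += 1
        if dev.checksum_errors >= ERROR_THRESHOLD:
            dev.online = False
    return None


good = b"important data"
expected = checksum(good)
mirror = [Device("disk0", good), Device("disk1", good)]

# Every read passes through bad memory: each attempt fails loudly,
# the error counters climb, and the devices end up offline.
for _ in range(ERROR_THRESHOLD):
    result = read_block(mirror, expected, lambda dev: True)
    print(result, [(d.name, d.checksum_errors, d.online) for d in mirror])
```

In this model the on-disk blocks are never modified, every failed read is counted and visible, and reads stop once the threshold is reached. Whether a real pool behaves the same way depends on details the sketch leaves out.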

On Feb 26, 2014, at 5:56 PM, Philip Robar <philip.ro...@gmail.com> wrote:

> Please note, I'm not trolling with this message. I worked in Sun's OS/Net
> group and am a huge fan of ZFS.
>
> The leading members of the FreeNAS community make it clear (with a
> detailed explanation and links to reports of data loss) that if you use ZFS
> without ECC RAM that there is a very good chance that you will eventually
> experience a total loss of your data without any hope of recovery.
> (Unless you have literally thousands of dollars to spend on that recovery.
> And even then there's no guarantee of said recovery.) The features of ZFS,
> checksumming and scrubbing, work together to silently spread the damage done
> by cosmic rays and/or bad memory throughout a file system and this corruption
> then spreads to your backups.
>
> Given this, aren't the various ZFS communities--particularly those that are
> small machine oriented --other than FreeNAS (and even they don't say it as
> strongly enough in their docs), doing users a great disservice by implicitly
> encouraging them to use ZFS w/o ECC RAM or on machines that can't use ECC RAM?
>
> As an indication of how persuaded I've been for the need of ECC RAM, I've
> shut down my personal server and am not going to access that data until I've
> built a new machine with ECC RAM.
>
> Phil
>
> ECC vs non-ECC RAM and ZFS:
> http://forums.freenas.org/index.php?threads/ecc-vs-non-ecc-ram-and-zfs.15449/
>
> cyberjock: "So when you read about how using ZFS is an "all or none" I'm
> not just making this up. I'm really serious as it really does work that way.
> ZFS either works great or doesn't work at all. That really truthfully [is]
> how it works."
>
> ZFS-macos, NAS4Free, PC-BSD, ZFS on Linux