On Sun, 15 Jan 2012, Jim Klimov wrote:

It does seem possible that in-memory corruption of data payload
and/or checksum of a block before writing it to disk would render
it invalid on read (data doesn't match checksum, ZFS returns EIO).
Maybe even worse if the in-memory block is corrupted before the
checksumming, and seemingly valid garbage gets stored on disk,
read afterwards, and used with blind trust.

Please don't understate the actual issue. ZFS assumes that RAM is 100% reliable. ZFS uses an in-memory cache called the ARC which can span many tens of gigabytes on busy large-memory systems. User data is stored in this ARC and the cached data becomes the reference copy of the data until it is evicted. This means that user data can be silently and undetectably corrupted due to memory corruption. The effects that ZFS's checksums can detect are just a small subset of the problems which may occur if memory returns wrong values.
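
To make the two cases concrete, here is a minimal sketch and not ZFS code. It assumes CRC32 as a stand-in for a real ZFS checksum, and the helpers flip_bit, write_block and read_block are hypothetical names invented for this illustration. A bit flip after checksumming is caught on read; a bit flip before checksumming is stored and returned as if it were good data.

```python
import zlib


def flip_bit(buf: bytes, bit: int) -> bytes:
    """Return a copy of buf with one bit inverted, simulating a RAM error."""
    out = bytearray(buf)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def write_block(data: bytes) -> tuple[bytes, int]:
    """Store a block with its checksum (CRC32 is only a stand-in here)."""
    return data, zlib.crc32(data)


def read_block(stored: tuple[bytes, int]) -> bytes:
    """Verify on read; raise an I/O error on mismatch (ZFS would return EIO)."""
    data, checksum = stored
    if zlib.crc32(data) != checksum:
        raise IOError("checksum mismatch")
    return data


original = b"user data payload"

# Case 1: the payload is corrupted after the checksum was computed.
data, checksum = write_block(original)
try:
    read_block((flip_bit(data, 3), checksum))
    detected = False
except IOError:
    detected = True  # the mismatch is noticed on read

# Case 2: the payload is corrupted before the checksum is computed.
# The checksum now covers the garbage, so the read succeeds.
silent = read_block(write_block(flip_bit(original, 3)))
```

In the second case nothing on the read path can tell that silent differs from original, which is the situation for data sitting in a cache in bad RAM.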

In all these cases RAM is the SPOF (single point of failure)
so all ZFS recommendations involve using ECC systems. Alas,
even though ECC chips and chipsets are cheap nowadays, not all
architectures use them anyway (e.g. desktops, laptops),
and the tagline of running ZFS for "reliable storage on consumer
grade hardware" is poisoned by this fact. Other filesystems

Feel free to blame Intel for this since they seem to be primarily responsible for delivering CPUs and chipsets which don't support ECC. AMD has not been such a perpetrator, although it is possible to buy AMD-based systems which don't provide ECC.

I do wonder, however, if it is possible to make a software ECC
to detect-and/or-repair small memory corruptions on consumer
grade systems. And where would such part fit - in ZFS (i.e.

This could be done for part of the memory but it would obviously result in huge performance loss. I/O to memory would have to become block-oriented rather than random access. It is still necessary for random access to be used in a large part of the memory since it is a requirement in order to run programs, and there would be no way to defend that part of the memory.

some ECC bits appended in every zfs_*_t structure) or in the
{Solaris} kernel for general VM management. And even then
there's a question whether this would solve problems or
create a greater one - give the appearance of a solution and
hide problems that actually exist (because there would be
some non-ECC parts of the data path and GIGO principle can
apply at any point). In the bad case, you ECC an invalid
piece of memory, and afterwards trust it as it matches the
checksum. On the good side, there is a smaller window that
data is exposed unprotected, so statistically this solution
should help.

The problem is that with unreliable memory, the software-based ECC would not be able to correct the content of the memory since the ECC itself might have been computed incorrectly (due to unreliable memory). You are then faced with notifications of problems that the user can't fix.
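
As an illustration of that limit, here is a minimal sketch of a software ECC using a textbook Hamming(7,4) code. This is an assumption chosen for brevity, not anything ZFS or the Solaris kernel implements, and the function names are hypothetical. It can repair a single bit flipped after encoding, but if the value was already wrong when the code was computed, the code simply vouches for the wrong value.

```python
def hamming74_encode(nibble: int) -> int:
    """Encode 4 data bits into a 7-bit codeword; bit i holds position i+1."""
    d = [(nibble >> i) & 1 for i in range(4)]
    p1 = d[0] ^ d[1] ^ d[3]  # parity over positions 1, 3, 5, 7
    p2 = d[0] ^ d[2] ^ d[3]  # parity over positions 2, 3, 6, 7
    p4 = d[1] ^ d[2] ^ d[3]  # parity over positions 4, 5, 6, 7
    bits = [p1, p2, d[0], p4, d[1], d[2], d[3]]
    return sum(b << i for i, b in enumerate(bits))


def hamming74_decode(word: int) -> int:
    """Correct at most one flipped bit, then return the 4 data bits."""
    bits = [(word >> i) & 1 for i in range(7)]
    syndrome = 0
    for pos in range(1, 8):
        if bits[pos - 1]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    if syndrome:
        bits[syndrome - 1] ^= 1  # the syndrome names the flipped position
    return bits[2] | (bits[4] << 1) | (bits[5] << 2) | (bits[6] << 3)


value = 0b1011

# A single bit flipped after encoding is repaired.
corrected = hamming74_decode(hamming74_encode(value) ^ (1 << 4))

# A bit flipped before encoding is protected as if it were correct.
bad = value ^ 0b0100
trusted = hamming74_decode(hamming74_encode(bad))
```

Here corrected equals value, while trusted equals bad: the decoder reports no error because the code was computed over data that was already damaged.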

The proper solution (regardless of filesystem used) is to ensure that ECC is included in any computer that you buy.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
