-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 02.03.14 09:45, Philip Robar wrote:
> On Sun, Mar 2, 2014 at 12:46 AM, Jean-Yves Avenard
> <jyaven...@gmail.com> wrote:
>
[ cut a lot not relevant to my comment ]

>> Bad RAM however has nothing to do with the occasional bit flip
>> that would be prevented using ECC RAM. The probability of a bit
>> flip is low, very low.
>>
>
> You and Jason have both claimed this. This is at odds with papers
> and studies I've seen mentioned elsewhere. Here's what a little
> searching found:
>
> Soft Error: https://en.wikipedia.org/wiki/Soft_error Which says
> that there are numerous sources of soft errors in memory and other
> circuits other than cosmic rays.
>
> ECC Memory: https://en.wikipedia.org/wiki/ECC_memory States that
> design has dealt with the problem of increased circuit density. It
> then mentions the research IBM did years ago and Google's 2009
> report which says:
>
> The actual error rate found was several orders of magnitude higher
> than previous small-scale or laboratory studies, with 25,000 to
> 70,000 errors per billion device hours per mega*bit* (about 2.5-7 ×
> 10^-11 error/bit·h)(i.e. about 5 single bit errors in 8 Gigabytes of
> RAM per hour using the top-end error rate), and more than 8% of
> DIMM memory modules affected by errors per year.

Do you have a *reliable* source for the claim in the paragraph above?

You say that an average 8 GB memory subsystem should experience 5 bit
errors per *hour* of operation. On the other hand, you say that (only)
8% of all DIMMs are affected per *year*.

I *guess* (and might be wrong) that the majority of installed DIMMs
nowadays are 2 GB DIMMs, so you need four of them to build 8 GB.
Assuming an equal distribution of bit errors, this means that on
average *every* DIMM will experience roughly 1 bit error per hour.
That doesn't fit.

Today's all-purpose PCs regularly ship with 8 GB of RAM, and modern,
widely used operating systems, no matter which vendor, all make
extensive use of every single bit of memory they can get. None of
them has any software means to protect RAM content, including FS
caches, against bit rot.
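
The arithmetic above can be checked with a short Python sketch. It
only restates the quoted figures (25,000 to 70,000 errors per billion
device hours per megabit) and the 2 GB DIMM guess. The unit conversion
(1 GB = 1024 MB = 8192 Mbit), the function names, and the Poisson
model used for the per-year figure are assumptions made for
illustration, not something taken from the study.

```python
import math

# Error-rate range quoted above: errors per billion device hours per megabit.
RATE_LOW = 25_000
RATE_HIGH = 70_000
HOURS_PER_YEAR = 24 * 365


def megabits(gigabytes: float) -> float:
    """Capacity in megabits, assuming 1 GB = 1024 MB and 1 MB = 8 Mbit."""
    return gigabytes * 1024 * 8


def errors_per_hour(gigabytes: float, rate: float) -> float:
    """Expected bit errors per hour for the given capacity and quoted rate."""
    return megabits(gigabytes) * rate / 1e9


def prob_affected(gigabytes: float, rate: float, hours: float) -> float:
    """Chance of at least one error within `hours`.

    Assumption (not from the study): errors arrive independently at a
    constant rate, i.e. a Poisson process spread evenly over all DIMMs.
    """
    expected = errors_per_hour(gigabytes, rate) * hours
    return -math.expm1(-expected)


if __name__ == "__main__":
    for rate in (RATE_LOW, RATE_HIGH):
        print(f"rate {rate}: 8 GB -> {errors_per_hour(8, rate):.2f} errors/h, "
              f"2 GB DIMM -> {errors_per_hour(2, rate):.2f} errors/h, "
              f"P(DIMM affected within a year) = "
              f"{prob_affected(2, rate, HOURS_PER_YEAR):.4f}")
```

At the top-end rate this gives about 4.6 errors per hour for 8 GB and
about 1.1 per hour for a single 2 GB DIMM. Under the even-distribution
assumption practically every DIMM would see an error within a year,
which is exactly the mismatch with the 8% figure.
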
With 5 bit errors per hour these machines should be pretty much
unusable, corrupting documents all day and probably crashing
applications, and sometimes the OS, repeatedly within a business day.

Yet I am not aware of any reports that daily office computing has
ceased to be reliably usable over the last decade.

So something doesn't fit here. Where is the (my?) mistake in this
reasoning?

Of course, this does not say anything about ZFS' vulnerability to RAM
errors compared to other parts of the system. I'll come to that point
in a different mail, but it will take a bit more time to write it up
without spreading more uncertainty than this thread has already
produced.

Best regards

Björn

- --
| Bjoern Kahl +++ Siegburg +++ Germany |
| "googlelogin@-my-domain-" +++ www.bjoern-kahl.de |
| Languages: German, English, Ancient Latin (a bit :-)) |
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQCVAgUBUxMJAFsDv2ib9OLFAQKIyAQAmZBIryCnndv1FZleZ5JRlQpVMZ8N+TmB
3FYBMTFk9c8caC65Avv9cKsP7Fq5X2F3gRfTzo8f8Kk9evsnOGheksFPs8y14gsP
AYTXz8B0rbZlfH/DQhV5JOYnEdeYXTwuN3Nso41CMER7EFpa6bEGSNiTiA8inbCr
GjHQot2gTwc=
=8fU0
-----END PGP SIGNATURE-----