Excellent. If you feel this is necessary, go for it. Those that have systems 
that don't have ECC should just run like the sky is falling, by your point of 
view. That said, I can guarantee none of the systems I have under my care have 
issues. How do I know? Well, the data is tested/compared at regular intervals. 
Maybe I'm the luckiest guy ever; where is that lottery ticket? Is ECC better? 
Possibly, probably in heavy-load environments, but no data has been provided to 
back this up, especially nothing in the context of what most users' needs are, 
at least here in the Mac space. Which ECC? Be specific. They are not all the 
same, just like regular RAM is not all the same, just like HDDs are not all the 
same. Fear mongering is wonderful and easy. Putting forth a solution guaranteed 
to be better is what's needed now. Did you actually reference a wiki? Seriously? 
A document anyone can edit to suit their view? I guess I come from a different 

Sent from my iPhone 5S

> On Apr 11, 2014, at 5:09 PM, Bayard Bell <buffer.g.overf...@gmail.com> wrote:
> If you want more of a smoking gun report on data corruption without ECC, try:
> https://blogs.oracle.com/vlad/entry/zfs_likes_to_have_ecc
> This view isn't isolated in terms of what people at Sun thought or what 
> people at Oracle now think. Try googling for "zfs ecc 
> site:blogs.oracle.com", and you'll find a recurring statement that ECC should 
> be used even in home deployment, with maybe one odd exception.
> The Wikipedia article, correctly summarising the Google study, is plain in 
> saying not that extremely high error rates are common but that error rates 
> are highly variable in large-sample studies, with some systems seeing 
> extremely high error rates. ECC gives a significant assurance based on an 
> incremental cost, so what's your data worth? You're not guaranteed to be 
> screwed by not using ECC (and the Google paper doesn't say this either), but 
> you are assuming risks that ECC mitigates. Look at the above blog, however: 
> even DIMMs that are high-quality but non-ECC can go wrong and result in nasty 
> system corruption.
> What generally protects you in terms of pool integrity is metadata redundancy 
> on top of integrity checks, but if you flip bits on metadata in-core before 
> writing redundant copies, well, that's a risk to pool integrity.
> I also think it's mistaken to say this is distinctly a problem with ZFS. Any 
> "next-generation" filesystem that provides protections against on-disk 
> corruption via checksums ends up with a residual risk focus on making sure 
> that in-core data integrity is robust. You could well have those problems on 
> the pools you've deployed, and there are a lot of situations in which you'd 
> never know and quite a lot (such as most of the bits in a photo or MP3) where 
> you'd never notice low rates of bit-flipping. The fact that you haven't noticed 
> doesn't equate to there being no problems in a strict sense; it's far more 
> likely that you've been able to tolerate the flipping that's happened. The 
> guy at Sun with the blog above got lucky: he was running high-quality non-ECC 
> RAM, and it went pear-shaped, at least for metadata cancer, quite quickly, 
> allowing him to recover by rolling back snapshots.
> Take a look out there, and you'll find people who are very confused about the 
> risks and available mitigations. I found someone saying that there's no 
> problem with more traditional RAID technologies because disks have CRCs. By 
> contrast, you can find Bonwick, educated as a statistician, talking about 
> SHA256 collisions by comparison to undetected ECC error rates and introducing 
> ZFS data integrity safeguards by way of analogy to ECC. That's why the 
> large-sample studies are interesting and useful: none of this technology 
> makes data corruption impossible; it just goes to extreme lengths to 
> marginalise the chances of those events by addressing known sources of errors 
> and fundamental error scenarios--in-core is so core that if you tolerate 
> error there, those errors will characterize systematic behaviour where you 
> have better outcomes reasonably available (and that's **reasonably** 
> available, I would suggest, in a way that the Madison paper's recommendation 
> to make ZFS buffers magical isn't). CRC-32 does a great job detecting bad 
> sectors and preventing them from being read back, but SHA256 in the right 
> place in a system detects errors that a well-conceived vdev topology will 
> generally make recoverable. That includes catching cases where an error isn't 
> caught by CRC-32, which may be a rare result, but when you've got the kind of 
> data densities that ZFS can allow, you're rolling the dice often enough that 
> those results become interesting.
> ECC is one of the most basic steps to take, and if you look at the 
> architectural literature, that's how it's treated. If you really want to be 
> in on the joke, find the OpenSolaris ZFS list thread from 2009 where someone 
> asks about ECC, and someone else jumps in to remark on how VirtualBox can be 
> poison for pool integrity for reasons rehearsed in my last post.
> Cheers,
> Bayard
>> On 1 April 2014 12:04, Jason Belec <jasonbe...@belecmartin.com> wrote:
>> ZFS is lots of parts, in most cases lots of cheap unreliable parts, 
>> refurbished parts, yadda yadda. As posted on this thread and many, many 
>> others, any issues are probably not ZFS but the parts of the whole. Yes, it 
>> could be ZFS, after you confirm that all the parts are pristine, maybe. 
>> My oldest system running ZFS is a Mac Mini Intel Core Duo with 3GB RAM (not 
>> ECC); it is the home server for music, TV shows, movies, and some interim 
>> backups. The Mini has been modded for eSATA and has 6 drives connected. The 
>> pool is 2 RaidZ of 3 mirrored with copies set at 2. Been running since ZFS 
>> was released from Apple builds. Lost 3 drives, eventually traced to a new 
>> cable that cracked at the connector, which, when hot enough, expanded, 
>> lifting 2 pins free of their connector counterparts and resulting in errors. 
>> Visually almost impossible to see. I replaced port multipliers, eSATA cards, 
>> RAM, Minis, power supply, reinstalled OS, reinstalled ZFS, restored ZFS data 
>> from backup, finally to find the bad connector end only because it was hot 
>> and felt 'funny'. 
>> Frustrating, yes, educational also. The happy news is, all the data was 
>> fine, wife would have torn me to shreds if photos were missing, music was 
>> corrupt, etc., etc. And this was on the old, out-of-date but stable ZFS 
>> version we Mac users have been hugging onto for dear life. YMMV
>> Never had RAM as the issue, here in the mad science lab across 10 rotating 
>> systems or in any client location - pick your decade. However, I don't use 
>> cheap RAM either, and I only have 2 systems requiring ECC currently that 
>> don't even connect to ZFS, as they are both Xserves with other lives.
>> --
>> Jason Belec
>> Sent from my iPad
>>> On Apr 1, 2014, at 12:13 AM, Daniel Becker <razzf...@gmail.com> wrote:
>>>> On Mar 31, 2014, at 7:41 PM, Eric Jaw <naisa...@gmail.com> wrote:
>>>> I started using ZFS a few weeks ago, so a lot of it is still new to 
>>>> me. I'm actually not completely certain about "proper procedure" for 
>>>> repairing a pool. I'm not sure if I'm supposed to clear the errors after 
>>>> the scrub, before or after (little things). I'm not sure if it even 
>>>> matters. When I restarted the VM, the checksum counts cleared on their own.
>>> The counts are not maintained across reboots.
>>>> On the first scrub it repaired roughly 1.65MB. None on the second scrub. 
>>>> Even after the scrub there were still 43 data errors. I was expecting they 
>>>> were going to go away.
>>>>> errors: 43 data errors, use '-v' for a list
>>> What this means is that in these 43 cases, the system was not able to 
>>> correct the error (i.e., both drives in a mirror returned bad data).
>>>> This is an excellent question. They're in 'Normal' mode. I remember 
>>>> looking into this before and decided normal mode should be fine. I might 
>>>> be wrong. So thanks for bringing this up. I'll have to check it out again.
>>> The reason I was asking is that these symptoms would also be consistent 
>>> with something outside the VM writing to the disks behind the VM’s back; 
>>> that’s unlikely to happen accidentally with disk images, but raw disks are 
>>> visible to the host OS as such, so it may be as simple as Windows deciding 
>>> that it should initialize the “unformatted” (really, formatted with an 
>>> unknown filesystem) devices. Or it could be a raid controller that stores 
>>> its array metadata in the last sector of the array’s disks.
>>>> memtest86 and memtest86+ for 18 hours came out okay. I'm on my third scrub 
>>>> and the number of errors has remained at 43. Checksum errors continue to 
>>>> pile up as the pool is getting scrubbed.
>>>> I'm just as flustered about this. Thanks again for the input.
>>> Given that you’re seeing a fairly large number of errors in your scrubs, 
>>> the fact that memtest86 doesn’t find anything at all very strongly suggests 
>>> that this is not actually a memory issue.
>> -- 
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "zfs-macos" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to zfs-macos+unsubscr...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.