I was able to destroy ZFS pools by trying to access them from inside 
VirtualBox, until I read the detailed documentation and set the disk buffer 
options correctly. I will dig into my notes and post the key setting to this 
thread when I find it.

But I've used ZFS for many years without ECC RAM with no trouble. It isn't 
the best way to go, but it isn't the lack of ECC that's killing a ZFS pool. 
It's the hypervisor's hardware emulation and buffering.
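
For reference while I dig: the VirtualBox manual documents an IgnoreFlush 
extradata key that controls whether the emulated controller honors the 
guest's cache-flush requests, which is the sort of setting I mean. Roughly, 
with the VM name and port number as placeholders:

    # 0 = pass the guest's flush requests through to the host instead of
    # ignoring them (shown for the emulated AHCI/SATA controller, first port)
    VBoxManage setextradata "My ZFS VM" \
        "VBoxInternal/Devices/ahci/0/LUN#0/Config/IgnoreFlush" 0

There is an equivalent key for the emulated IDE controller.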

Sent from my iPad

> On Apr 1, 2014, at 5:24 PM, Jason Belec <jasonbe...@belecmartin.com> wrote:
> 
> I think Bayard has hit on some very interesting points, part of what I was 
> alluding to, but very well presented here. 
> 
> Jason
> Sent from my iPhone 5S
> 
>> On Apr 1, 2014, at 7:14 PM, Bayard Bell <buffer.g.overf...@gmail.com> wrote:
>> 
>> Could you explain how you're using VirtualBox and why you'd use a type 2 
>> hypervisor in this context?
>> 
>> Here's a scenario where you really have to be careful with hypervisors: ZFS 
>> tells a virtualised controller that it needs to sync a buffer, and the 
>> controller tells ZFS that all's well while perhaps only issuing an async 
>> flush. ZFS thinks it's done all the I/Os to roll a TXG to stable storage, 
>> but in the meantime something else crashes and whoosh go your buffers.
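>> 
>> (One concrete knob in this area: VirtualBox lets you toggle the host I/O 
>> cache per storage controller, and showvminfo should report the current 
>> mode. A rough illustration, with the VM and controller names as 
>> placeholders:
>> 
>>     VBoxManage showvminfo "My ZFS VM" | grep -i cache
>>     VBoxManage storagectl "My ZFS VM" --name "SATA" --hostiocache off
>> 
>> Whether caching helps or hurts depends on the rest of the stack, but it's 
>> worth knowing which mode you're actually in.)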
>> 
>> I'm not sure it's come across particularly well in this thread, but ZFS 
>> doesn't and can't cope with hardware that's so unreliable that it lies 
>> about basic things, like whether your writes have made it to stable 
>> storage, or that doesn't mind the shop, as is the case with non-ECC 
>> memory. It's one thing when a device reads back something that doesn't 
>> match the checksum, but it gets uglier when you've got a single I/O path 
>> and a controller that writes the wrong bits in stride (I've seen this), or 
>> when the problems are even closer to home (and again I emphasise RAM). You 
>> may not have problems right away. You may have problems where you can't 
>> tell the difference, like flipped bits in data buffers that have no other 
>> integrity checks. But you can also run into complex failure scenarios 
>> where ZFS has to cash in on guarantees that were rather more approximate 
>> than it was told. Then it's no longer a case of a few flipped bits in 
>> photos or MP3s, but of not being able to import your pool at all, or of 
>> needing someone who knows how to operate zdb to do some additional TXG 
>> rollback to get your data back after losing some updates.
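>> 
>> (For concreteness, that last-resort recovery tends to look something like 
>> the following. This is a rough sketch rather than a recipe, assuming a 
>> pool named 'tank' and a member disk at /dev/disk2:
>> 
>>     # try a rewind import, discarding the last few transaction groups
>>     zpool import -F tank
>>     # if that fails, inspect labels and uberblocks for an older, intact TXG
>>     zdb -ul /dev/disk2
>>     # then attempt a read-only import at that specific TXG
>>     zpool import -o readonly=on -T <txg> tank
>> 
>> Not commands you want to be learning for the first time on a pool you care 
>> about.)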
>> 
>> I don't know if you're running ZFS in a VM or running VMs on top of ZFS, but 
>> either way, you probably want to Google for "data loss" "VirtualBox" and 
>> whatever device you're emulating and see whether there are known issues. You 
>> can find issue reports out there on VirtualBox data loss, but working 
>> through bug reports can be challenging.
>> 
>> Cheers,
>> Bayard
>> 
>>> On 1 April 2014 16:34, Eric Jaw <naisa...@gmail.com> wrote:
>>> 
>>> 
>>>> On Tuesday, April 1, 2014 7:04:39 AM UTC-4, jasonbelec wrote:
>>>> ZFS is lots of parts, in most cases lots of cheap, unreliable, 
>>>> refurbished parts, yadda yadda. As posted on this thread and many, many 
>>>> others, any issues are probably not ZFS but the parts of the whole. Yes, 
>>>> it could be ZFS, but only after you confirm that all the parts are 
>>>> pristine, maybe. 
>>> 
>>> 
>>> I don't think it's ZFS. ZFS is pretty solid. In my specific case, I'm 
>>> trying to figure out why VirtualBox is creating these issues. I'm pretty 
>>> sure that's the root cause, but I don't know why yet, so I'm just 
>>> speculating at this point. Of course, I want to get my ZFS up and running 
>>> so I can move on to what I really need to do, so it's easy to jump to a 
>>> conclusion about something I haven't thought of. Hope you can understand.
>>>  
>>>> 
>>>> My oldest system running ZFS is a Mac Mini Intel Core Duo with 3GB RAM 
>>>> (not ECC); it is the home server for music, TV shows, movies, and some 
>>>> interim backups. The mini has been modded for eSATA and has 6 drives 
>>>> connected. The pool is 2 RAID-Zs of 3, mirrored, with copies set at 2. 
>>>> It has been running since ZFS was first released from Apple builds. Lost 
>>>> 3 drives, eventually traced to a new cable that had cracked at the 
>>>> connector; when it got hot enough it expanded, lifting 2 pins free of 
>>>> their counterparts and resulting in errors. Visually almost impossible 
>>>> to see. I replaced port multipliers, eSATA cards, RAM, minis, and the 
>>>> power supply, reinstalled the OS, reinstalled ZFS, and restored the ZFS 
>>>> data from backup, only to finally find the bad connector end because it 
>>>> was hot and felt 'funny'. 
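>>>> 
>>>> (For anyone curious, a layout along those lines, under my reading of the 
>>>> description and with made-up device names, would be created roughly like 
>>>> this:
>>>> 
>>>>     # two 3-disk raidz vdevs striped into one pool
>>>>     zpool create media raidz disk1 disk2 disk3 raidz disk4 disk5 disk6
>>>>     # keep two copies of every block on top of the raidz redundancy
>>>>     zfs set copies=2 media
>>>> 
>>>> Treat it as illustrative; the point is the layered redundancy.)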
>>>> 
>>>> Frustrating, yes, but educational also. The happy news is that all the 
>>>> data was fine; the wife would have torn me to shreds if photos were 
>>>> missing, music was corrupt, etc., etc. And this was on the old, 
>>>> out-of-date but stable ZFS version we Mac users have been hugging onto 
>>>> for dear life. YMMV
>>>> 
>>>> Never had RAM as the issue, here in the mad science lab across 10 
>>>> rotating systems or in any client location - pick your decade. However, 
>>>> I don't use cheap RAM either, and the only 2 systems I currently have 
>>>> that require ECC don't even connect to ZFS, as they are both Xserves 
>>>> with other lives.
>>>> 
>>>> 
>>>> --
>>>> Jason Belec
>>>> Sent from my iPad
>>>> 
>>>>> On Apr 1, 2014, at 12:13 AM, Daniel Becker <razz...@gmail.com> wrote:
>>>>> 
>>>> 
>>>>>> On Mar 31, 2014, at 7:41 PM, Eric Jaw <nais...@gmail.com> wrote:
>>>>>> 
>>>>>> I started using ZFS a few weeks ago, so a lot of it is still new to 
>>>>>> me. I'm actually not completely certain about the proper procedure for 
>>>>>> repairing a pool. I'm not sure if I'm supposed to clear the errors 
>>>>>> before or after the scrub (little things), or if it even matters. When 
>>>>>> I restarted the VM, the checksum counts cleared on their own.
>>>>> 
>>>>> The counts are not maintained across reboots.
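>>>>> 
>>>>> (If it helps, the usual sequence is something like the following, with 
>>>>> 'tank' standing in for the pool name:
>>>>> 
>>>>>     zpool scrub tank        # walk and verify every block
>>>>>     zpool status -v tank    # review anything that couldn't be repaired
>>>>>     zpool clear tank        # reset the error counters once satisfied
>>>>> 
>>>>> Clearing only zeroes the counters; it doesn't repair or discard 
>>>>> anything, so the ordering matters less than people fear.)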
>>>>> 
>>>>> 
>>>>>> On the first scrub it repaired roughly 1.65MB. None on the second 
>>>>>> scrub. Even after the scrub there were still 43 data errors. I was 
>>>>>> expecting them to go away.
>>>>>> 
>>>>>>> errors: 43 data errors, use '-v' for a list
>>>>> 
>>>>> What this means is that in these 43 cases, the system was not able to 
>>>>> correct the error (i.e., both drives in a mirror returned bad data).
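>>>>> 
>>>>> (In that situation the affected files stay on the permanent-error list 
>>>>> until you deal with them. A rough outline, again assuming a pool named 
>>>>> 'tank':
>>>>> 
>>>>>     zpool status -v tank     # note the file paths it reports
>>>>>     # restore those files from a good backup, or delete them
>>>>>     zpool scrub tank         # a later scrub lets the entries age out
>>>>> 
>>>>> ZFS can't reconstruct data with no good copy left; it can only tell you 
>>>>> which files were hit.)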
>>>>> 
>>>>> 
>>>>>> This is an excellent question. They're in 'Normal' mode. I remember 
>>>>>> looking into this before and decided normal mode should be fine. I 
>>>>>> might be wrong, so thanks for bringing this up. I'll have to check it 
>>>>>> out again.
>>>>> 
>>>>> The reason I was asking is that these symptoms would also be consistent 
>>>>> with something outside the VM writing to the disks behind the VM’s back; 
>>>>> that’s unlikely to happen accidentally with disk images, but raw disks 
>>>>> are visible to the host OS as such, so it may be as simple as Windows 
>>>>> deciding that it should initialize the “unformatted” (really, formatted 
>>>>> with an unknown filesystem) devices. Or it could be a raid controller 
>>>>> that stores its array metadata in the last sector of the array’s disks.
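>>>>> 
>>>>> (If the host is Windows and the disks are passed through raw, one way 
>>>>> to keep the host from touching them, sketched from memory, is to take 
>>>>> them offline and mark them read-only in diskpart before handing them to 
>>>>> the VM:
>>>>> 
>>>>>     diskpart
>>>>>     DISKPART> select disk 2
>>>>>     DISKPART> offline disk
>>>>>     DISKPART> attributes disk set readonly
>>>>> 
>>>>> The disk number is only an example; check it with 'list disk' first.)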
>>>>> 
>>>>> 
>>>>>> memtest86 and memtest86+ ran for 18 hours and came out okay. I'm on my 
>>>>>> third scrub and the number of errors has remained at 43. Checksum 
>>>>>> errors continue to pile up as the pool is getting scrubbed.
>>>>>> 
>>>>>> I'm just as flustered about this. Thanks again for the input.
>>>>> 
>>>>> Given that you’re seeing a fairly large number of errors in your scrubs, 
>>>>> the fact that memtest86 doesn’t find anything at all very strongly 
>>>>> suggests that this is not actually a memory issue.
>>> 
