-----Original Message-----
From: Frank Van Damme
Sent: Friday, May 20, 2011 6:25 AM

>On 20-05-11 01:17, Chris Forgeron wrote:
>> I ended up switching back to FreeBSD after using Solaris for some time 
>> because I was getting tired of weird pool corruptions and the like.
>
>Did you ever manage to recover the data you blogged about on Sunday, February 
>6, 2011?

Oh yes, I didn't follow up on that; I'll have to do that now. Here's the recap.

Yes, I did get most of it back, thanks to a lot of effort from George Wilson 
(great guy, and I'm very indebted to him). However, any data that was in play 
at the time of the fault was irreversibly damaged and couldn't be restored. Any 
data that wasn't active at the time of the crash was perfectly fine; it just 
needed to be copied out of the pool into a new pool. George had to mount the 
pool for me, as it was beyond a non-ZFS-programmer's skill to mount. 
Unfortunately, Solaris would dump after about 24 hours, requiring a second 
mounting by George. The pool was also slower than cold molasses to copy from in 
its faulted state: if I was getting 1 MB/sec, I was lucky. You can imagine the 
issue that creates when you're trying to evacuate a few TB of data through a 
pipe that slow. 
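
For perspective, here's the back-of-envelope math on that evacuation (a quick C 
sketch; calling "a few TB" 3 TB is just my assumption):

#include <stdio.h>

int main(void)
{
    double tb = 3.0;                /* assumed size of "a few TB" */
    double mb = tb * 1024 * 1024;   /* TB -> MB */
    double days = mb / 86400;       /* at 1 MB/sec, seconds -> days */
    printf("%.0f TB at 1 MB/sec: about %.0f days\n", tb, days);
    /* prints: 3 TB at 1 MB/sec: about 36 days */
    return 0;
}

Over a month of continuous copying, and that's assuming the faulted pool stays 
up the whole time.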

After it dumped again, I didn't bother George for a third remounting (or 
rather, I tried only half-heartedly; he had already put a lot of time into 
this, and we all have our day jobs), and abandoned the data that was still 
stranded on the faulted pool. I had copied my most-wanted data first, so what I 
abandoned was a personal collection of movies that I can always re-rip. 


I was still experimenting with ZFS at the time, so I wasn't using snapshots for 
backup, just conventional image backups of the VMs that were running. Snapshots 
would have had a good chance of protecting my data from the fault I ran into, 
since it was the active data that got damaged; see the sketch below. 
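
For anyone wondering what that looks like, here's a minimal sketch in C that 
shells out to the zfs(8) CLI (the dataset name tank/vms is hypothetical, and a 
one-line cron'd shell command does the same job):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    char stamp[32], cmd[128];
    time_t now = time(NULL);

    /* Name the snapshot with a timestamp, e.g. tank/vms@auto-20110520-0625 */
    strftime(stamp, sizeof(stamp), "auto-%Y%m%d-%H%M", localtime(&now));
    snprintf(cmd, sizeof(cmd), "zfs snapshot -r tank/vms@%s", stamp);

    /* Snapshots are atomic and nearly free; -r covers child datasets too */
    if (system(cmd) != 0) {
        fprintf(stderr, "snapshot failed: %s\n", cmd);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

Run hourly from cron, a fault like mine would have cost only the last hour of 
active VM writes, provided the snapshots themselves survived the fault.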


I originally blamed my Areca 1880 card, as I was working with Areca tech 
support on a more stable driver for Solaris and was on the 3rd revision of a 
driver with them. In the end, though, it wasn't the Areca, as I was very 
familiar with its tricks: the Areca would hang (about once every day or two), 
but it wouldn't take out the pool. After removing the Areca and going with 
LSI 2008-based controllers only, I had one final fault about 3 weeks later 
that corrupted another pool (luckily just a backup pool). At that point the 
swearing in the server room reached a peak, I booted back into FreeBSD, and I 
haven't looked back. For what it's worth, when I originally used the Areca 
controller with FreeBSD, I didn't have any problems for about 2 months. 

I've had only small FreeBSD issues since then, and nothing else has changed in 
my hardware. So the only claim I can make is that in my environment, on my 
hardware, I've had better stability with FreeBSD. 

One of the slowdowns with FreeBSD in my comparison tests was the O_SYNC 
(stable) write mode that ESX uses when it mounts an NFS store. I edited the 
FreeBSD NFS server source to always do an async write, regardless of the 
O_SYNC from the client, and that perked FreeBSD up a lot for speed, making it 
fairly close to what I was getting on Solaris. FreeBSD is now using a 4.1 NFS 
server by default as of the last month, and I'm just starting stability tests 
with a new FreeBSD 9 build to see if I can run the newer code. I'll do speed 
tests again, and will probably make the same hack to the 4.1 NFS code to force 
async writes (the idea is sketched below). I'll post to my blog and the 
FreeBSD lists when that happens, as it's out of scope for this list. 
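
For the curious, the hack itself is tiny. A rough sketch of the kernel-side 
idea (identifier names are approximate, not a verbatim FreeBSD patch; the real 
change sits in the NFS server's write service routine):

/*
 * The client's stable_how field normally decides whether the server
 * must commit a write to stable storage before replying.  Forcing it
 * to UNSTABLE makes every write async, ESX's O_SYNC included.  This
 * trades NFS crash-consistency guarantees for speed; use with care.
 */
stable = NFSWRITE_UNSTABLE;             /* the hack: ignore the client */

if (stable == NFSWRITE_UNSTABLE)
    ioflags = IO_NODELOCKED;            /* async write */
else
    ioflags = IO_SYNC | IO_NODELOCKED;  /* committed (sync) write */

error = VOP_WRITE(vp, &uio, ioflags, cred);

One line to override the client's request; everything else is what the server 
already does for UNSTABLE writes.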

I do like Solaris. After some initial discomfort with the different way things 
are done, I do see the overall design and idea, and I now have a wish list of 
features I'd like to see ported to FreeBSD. I think I'll have a Solaris-based 
box set up again for testing. We'll see what time allows. 
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss