Hello Marc,

Tuesday, October 21, 2008, 8:14:17 AM, you wrote:

MB> About 2 years ago I used to run snv_55b with a raidz on top of 5 500GB SATA
MB> drives. After 10 months I ran out of space and added a mirror of 2 250GB
MB> drives to my pool with "zpool add". No pb. I scrubbed it weekly. I only saw 1
MB> CKSUM error one day (ZFS self-healed itself automatically of course). Never
MB> had any pb with that server.

MB> After running out of space again I replaced it with a new system running
MB> snv_82, configured with a raidz on top of 7 750GB drives. To burn in the
MB> machine, I wrote a python script that read random sectors from the drives. I
MB> let it run for 48 hours to subject each disk to 10+ million I/O operations.
MB> After it passed this test, I created the pool and ran some more scripts to
MB> create/delete files off it continuously. To test disk failures (and SATA
MB> hotplug), I disconnected and reconnected a drive at random while the scripts
MB> were running. The system was always able to redetect the drive immediately
MB> after being plugged in (you need "set sata:sata_auto_online=1" for this to
MB> work). Depending on how long the drive had been disconnected, I either needed
MB> to do a "zpool replace" or nothing at all, for the system to re-add the disk
MB> to the pool and initiate a resilver. After these tests, I trusted the system
MB> enough to move all my data to it, so I rsync'd everything and double-checked
MB> it with MD5 sums.

MB> I have another ZFS server, at work, on which 1 disk someday started acting
MB> weirdly (timeouts). I physically replaced it, and ran "zpool replace". The
MB> resilver completed successfully. On this server, we have seen 2 CKSUM errors
MB> over the last 18 months or so. We read about 3 TB of data every day from it
MB> (daily rsync), that amounts to about 1.5 PB over 18 months. I guess 2 silent
MB> data corruptions while reading that quantity of data is about the expected
MB> error rate of modern SATA drives.
MB> (Again ZFS self-healed itself, so this was
MB> completely transparent to us.)

Which means you haven't experienced silent data corruption thanks to ZFS. :)

--
Best regards,
 Robert                          mailto:[EMAIL PROTECTED]
                                 http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
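
For anyone who wants to redo the arithmetic in the quoted message (3 TB read per day, 18 months, 2 CKSUM errors), here is a rough sketch in Python. The decimal units (1 TB = 10**12 bytes), the average month length of 30.44 days, and the function names are my own assumptions for illustration; none of them come from the original post.

```python
# Back-of-the-envelope check of the figures in the quoted message.
# Assumptions (not from the post): decimal units and 30.44 days per month.

TB = 10**12  # bytes in a decimal terabyte
PB = 10**15  # bytes in a decimal petabyte


def total_bytes_read(tb_per_day, months, days_per_month=30.44):
    """Total bytes read at a steady daily rate over the given number of months."""
    return tb_per_day * TB * months * days_per_month


def bytes_per_error(total_bytes, errors):
    """Average number of bytes read per observed checksum error."""
    return total_bytes / errors


total = total_bytes_read(3, 18)        # 3 TB/day for 18 months
per_error = bytes_per_error(total, 2)  # 2 CKSUM errors in that period

print(f"total read: {total / PB:.2f} PB")
print(f"one checksum error per {per_error / PB:.2f} PB read")
print(f"one checksum error per {per_error * 8:.2e} bits read")
```

Under these assumptions the total comes out near 1.6 PB, in line with the rough 1.5 PB quoted, which works out to about one checksum error per 0.8 PB read.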
