Today I encountered data corruption on two ZFS pools due to a RAM failure in my
OI box running on a Dell T710. My rpool now looks like this (after reboot):
  status: One or more devices has experienced an error resulting in data
          corruption.  Applications may be affected.
  action: Restore the file in question if possible.  Otherwise restore the
          entire pool from backup.
    scan: scrub repaired 0 in 1h1m with 1 errors on Tue Jan 31 19:59:50 2012
  config:

          NAME                         STATE     READ WRITE CKSUM
          rpool                        ONLINE       0     0     1
            mirror-0                   ONLINE       0     0     2
              c4t50014EE10313DE5Dd0s0  ONLINE       0     0     2
              c4t50014EE158688073d0s0  ONLINE       0     0     2

  errors: Permanent errors have been detected in the following files:
I have 17 files that are reported as permanently corrupted. The corruption of
gdm/core was found while scrubbing the pool. The other 16 files were reported
as corrupted after the pool fell into a degraded state. I'm not sure these
files are really corrupted, though: I can access all of them, and e.g.
/usr/gnu/bin/rm runs with no faults. All of the files have md5 sums identical
to the corresponding files on a different box running the same version of OI.
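For what it's worth, this is roughly how I did the comparison. A minimal sketch, assuming GNU md5sum is available (on OI/Solaris, `digest -a md5` works too); the reference-copy path and the host name `otherbox` are placeholders, not anything from the actual setup:

```shell
# compare_md5 FILE1 FILE2 -- print "match" if the md5 sums agree,
# "MISMATCH" otherwise.
compare_md5() {
    a=$(md5sum "$1" | awk '{print $1}')
    b=$(md5sum "$2" | awk '{print $1}')
    if [ "$a" = "$b" ]; then
        echo "match"
    else
        echo "MISMATCH"
    fi
}

# Typical use against a known-good copy pulled from a healthy box
# (hypothetical host name "otherbox"):
#   scp otherbox:/usr/gnu/bin/rm /tmp/rm.ref
#   compare_md5 /usr/gnu/bin/rm /tmp/rm.ref
```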
How do I find out whether these files are really corrupted? And if they turn
out to be fine, how do I get rid of the errors?
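I assume the answer involves something like the usual clear-and-scrub sequence below, but I'd like confirmation before running it against a pool that may still hold bad data. A sketch, to be run as root, only after the flagged files have been verified good or restored:

```shell
# Reset the pool's error counters and the list of flagged files.
zpool clear rpool

# Re-read and verify every block; a clean scrub should leave the
# error list empty.
zpool scrub rpool

# Check the result once the scrub completes.
zpool status -v rpool
```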
How can two healthy pools get that messed up when a single RAM DIMM goes bad?
zfs-discuss mailing list