On Wed, Apr 15, 2009 at 10:32:13PM +0800, Uwe Dippel wrote:
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are unaffected.
...
> errors: No known data errors
>
> Now I wonder where that error came from. It was just a single checksum
Hmmm, ~2 weeks ago I also had a curious thing with a StorEdge 3510 (2x 2Gbps FC MP, 1 controller, 2x 6 HDDs mirrored and exported as a single device, no ZIL etc. tricks) connected to an X4600: Since grill party season has started, the 3510 decided at a room temp of 33°C to go "offline" and take part in the party ;-). The result was that during the offline time everything that tried to access a ZFS on that pool blocked (i.e. got neither a timeout nor an error) - from that point of view more or less expected. After the 3510 came back, a 'zpool status ..' showed something like this:

        NAME                                     STATE    READ WRITE CKSUM
        pool2                                    FAULTED  289K 4.03M     0
          c4t600C0FF000000000099C790E0144EC00d0  FAULTED  289K 4.03M     0  too many errors

errors: Permanent errors have been detected in the following files:

        pool2/home/stud/inf/foobar:<0x0>

Still everything was blocking. After a 'zpool clear' all ZFS (~2300 on that pool) except the listed one were accessible again, but the status message stayed unchanged. Curious - I thought that the blocking/waiting for the device to come back and the ZFS transaction machinery are made exactly for a situation like this, i.e. to "re-commit" un-ACKed actions ... Anyway, finally scrubbing the pool brought it back to a normal ONLINE state without any errors. To be sure, I compared the ZFS in question with the backup from some hours earlier - no difference. So, same question as in the subject.

BTW: Some days later we had an even bigger grill party (~38°C) - this time the X4xxx machines in this room decided to go offline and take part as well (the V4xx's kept running ;-)). So first the 3510 went, and some time later the X4600. This time the pool was in DEGRADED state after coming back online and had some more errors like the one above, plus:

        <metadata>:<0x103>
        <metadata>:<0x4007>
        ...

Clearing and scrubbing brought it back to a normal ONLINE state again without any errors. Spot checks on the files noted as having errors showed no damage ... Everything fine (wrt. data loss), but curious ...

Regards,
jel.
--
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 12768
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss