Hello Rustam,

Saturday, May 3, 2008, 9:16:41 AM, you wrote:

R> I don't think that this is a hardware issue; however, I don't exclude it. I'll
R> try to explain why.

R> 1. I've replaced all memory modules, which are the most likely cause of such a
R> problem.

R> 2. There are many different applications running on that server
R> (Apache, PostgreSQL, etc.). However, if you look at the four
R> different crash dump stack traces you see the same picture:

R> ------ crash dump st1 ------
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> spa_scrub_io_start+0xf1()
R> spa_scrub_cb+0x13d()

R> ------ crash dump st2 ------
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> arc_read+0x3cc()
R> dbuf_prefetch+0x11d()
R> dmu_prefetch+0x107()
R> zfs_readdir+0x408()
R> fop_readdir+0x34()

R> ------ crash dump st3 ------
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> arc_read+0x3cc()
R> dbuf_prefetch+0x11d()
R> dmu_prefetch+0x107()
R> zfs_readdir+0x408()
R> fop_readdir+0x34()

R> ------ crash dump st4 ------
R> mutex_enter+0xb()
R> zio_buf_alloc+0x1a()
R> zio_read+0xba()
R> arc_read+0x3cc()
R> dbuf_prefetch+0x11d()
R> dmu_prefetch+0x107()
R> zfs_readdir+0x408()
R> fop_readdir+0x34()


R> All four crash dumps show a problem at zio_read/zio_buf_alloc. Three
R> of these appeared during metadata prefetch (dmu_prefetch) and one
R> during scrubbing. I don't think that it's a coincidence. IMHO,
R> checksum errors are the result of this inconsistency.

Which would happen if you have a problem with HW and you're getting
wrong checksums on both sides of your mirrors. Maybe the power supply (PS)?

Try memtest anyway, or SunVTS.
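
If it helps, the overlap between the four traces quoted above can be
checked mechanically. This is only a rough sketch: the frame strings are
copied from the quoted dumps, while the helper names (frames,
common_innermost) are made up for illustration and are not part of any
ZFS or mdb tooling.

```python
# Frames copied from the crash dumps quoted above, innermost frame first.
ST1 = """mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
spa_scrub_io_start+0xf1()
spa_scrub_cb+0x13d()"""

ST2 = """mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()"""

# st3 and st4 in the quoted mail are identical to st2.
TRACES = [ST1, ST2, ST2, ST2]


def frames(trace):
    """Split a pasted trace into a list of non-empty frame strings."""
    return [line.strip() for line in trace.splitlines() if line.strip()]


def common_innermost(traces):
    """Return the frames shared by every trace, starting at the innermost one."""
    shared = []
    for column in zip(*(frames(t) for t in traces)):
        if len(set(column)) != 1:
            break  # traces diverge at this depth
        shared.append(column[0])
    return shared


print(common_innermost(TRACES))
# ['mutex_enter+0xb()', 'zio_buf_alloc+0x1a()', 'zio_read+0xba()']
```

All four dumps share the same three innermost frames and diverge only
above zio_read. That shows where the panics surface; by itself it does
not tell a software bug from bad hardware.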



-- 
Best regards,
 Robert Milkowski                            mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
