I don't think that this is a hardware issue, although I don't exclude it. I'll try to explain why.

1. I've replaced all the memory modules, which are the most likely cause of such a problem.

2. There are many different applications running on that server (Apache, PostgreSQL, etc.). However, if you look at the four different crash dump stack traces, you see the same picture:

------ crash dump st1 ------
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
spa_scrub_io_start+0xf1()
spa_scrub_cb+0x13d()

------ crash dump st2 ------
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

------ crash dump st3 ------
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

------ crash dump st4 ------
mutex_enter+0xb()
zio_buf_alloc+0x1a()
zio_read+0xba()
arc_read+0x3cc()
dbuf_prefetch+0x11d()
dmu_prefetch+0x107()
zfs_readdir+0x408()
fop_readdir+0x34()

All four crash dumps fail in the same place: mutex_enter, called from zio_buf_alloc, called from zio_read. Three of them occurred during metadata prefetch (dmu_prefetch) and one during scrubbing. I don't think that is a coincidence. IMHO, the checksum errors are the result of this inconsistency.

I tend to think that the problem is in ZFS and that it exists even in the latest Solaris version (maybe in OpenSolaris as well).

>
> Lots of CKSUM errors like you see is often indicative
> of bad hardware. Run
> memtest for 24-48 hours.
>
> -marc

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss