On Mon, Nov 05, 2007 at 02:22:12PM +0100, Pawel Jakub Dawidek wrote:
> Hi.
>
> I'm observing the following deadlock.
>
> One thread holds zfsvfs->z_hold_mtx[i] lock and waits for I/O:
[...]
> Another thread tries to finish I/O, but can't, because it is trying to
> acquire zfsvfs->z_hold_mtx[i] lock:
[...]
> As you can see, the SPA ZIO INTR thread wants to finish I/O requests, calls
> arc_write_done(). All in all we call vn_rele(), which may end up in calling
> zfs_zinactive() if this was the last vnode reference. zfs_zinactive() tries
> to acquire zfsvfs->z_hold_mtx[i] lock, but can't, because it is already held
> by the thread waiting for I/O. Neither I/O can be finished, nor lock can be
> unlocked.
>
> I started to see this deadlock after last integration, so it was introduced
> in more or less during last month. Any ideas how to solve it?
I was able to trigger this deadlock with older code as well.
It seems I fixed it be changing the zfs_get_done() function from:
void
zfs_get_done(dmu_buf_t *db, void *vzgd)
{
zgd_t *zgd = (zgd_t *)vzgd;
rl_t *rl = zgd->zgd_rl;
vnode_t *vp = ZTOV(rl->r_zp);
dmu_buf_rele(db, vzgd);
zfs_range_unlock(rl);
VN_RELE(vp);
zil_add_vdev(zgd->zgd_zilog, DVA_GET_VDEV(BP_IDENTITY(zgd->zgd_bp)));
kmem_free(zgd, sizeof (zgd_t));
}
To something like this:
void
zfs_get_done(dmu_buf_t *db, void *vzgd)
{
zgd_t *zgd = (zgd_t *)vzgd;
rl_t *rl = zgd->zgd_rl;
vnode_t *vp = ZTOV(rl->r_zp);
dmu_buf_rele(db, vzgd);
zfs_range_unlock(rl);
zgd->zgd_vnode = vp;
zil_add_vdev(zgd->zgd_zilog, DVA_GET_VDEV(BP_IDENTITY(zgd->zgd_bp)));
/*
* Put the vnode onto the list. Call VN_RELE() on this vnode from
* within another kernel thread later (also free zgd there).
*/
mtx_lock(&zgd_done_mtx);
TAILQ_INSERT_TAIL(&zgd_done, zgd, zgd_next);
mtx_unlock(&zgd_done_mtx);
wakeup(&zgd_done);
}
Which bascially means, that VN_RELE() will be called from another thread
later. This seems to work, but I changed the order of VN_RELE() and
zil_add_vdev() calls. Is that safe? Doing zil_add_vdev() also from this
additional thread didn't worked for me.
--
Pawel Jakub Dawidek http://www.wheel.pl
pjd at FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
URL:
<http://mail.opensolaris.org/pipermail/zfs-code/attachments/20071107/e27a4697/attachment.bin>