On Mon, 7 May 2018 17:19:46 +0200,
Olaf Hering <o...@aepfle.de> wrote:

> What I gathered during debugging so far is that somehow qemu on the receiving 
> side locks a region twice:

After further debugging with many wild printfs:
On the receiving side, blockdev_init sets BDRV_O_INACTIVE because the run 
state is RUN_STATE_INMIGRATE.
BDRV_O_INACTIVE causes bdrv_is_writable to return false.
As a result, bdrv_format_default_perms does not set BLK_PERM_WRITE in perms.
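
The chain above can be modelled in a few lines. This is only a sketch of 
the logic as described here, not the QEMU source: every demo_* function 
and DEMO_* constant is a made-up stand-in, and the flag values are 
arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the open flags; values are illustrative only. */
enum { DEMO_O_RDWR = 1 << 0, DEMO_O_INACTIVE = 1 << 1 };
/* Hypothetical stand-ins for the permission bits. */
enum { DEMO_PERM_CONSISTENT_READ = 1 << 0, DEMO_PERM_WRITE = 1 << 1 };

/* Models blockdev_init: an incoming migration marks the image inactive. */
static int demo_blockdev_flags(bool incoming_migration)
{
    int flags = DEMO_O_RDWR;
    if (incoming_migration) {
        flags |= DEMO_O_INACTIVE;
    }
    return flags;
}

/* Models bdrv_is_writable: an inactive image is never writable. */
static bool demo_is_writable(int open_flags)
{
    return (open_flags & DEMO_O_RDWR) && !(open_flags & DEMO_O_INACTIVE);
}

/* Models the default perms: the write bit is requested only if writable. */
static uint64_t demo_default_perms(int open_flags)
{
    uint64_t perms = DEMO_PERM_CONSISTENT_READ;
    if (demo_is_writable(open_flags)) {
        perms |= DEMO_PERM_WRITE;
    }
    return perms;
}
```

With demo_blockdev_flags(true) the write bit stays clear in the result of 
demo_default_perms, with demo_blockdev_flags(false) it is set.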

On the sending side offset 0xc9 is unlocked on the other fd, which allows 
F_WRLCK to succeed:
2018-05-08T11:20:54.491168Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:20:54.492162Z qemu-system-i386: qemu_lock_fd_test: 28 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:20:54.494752Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.189455Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.190460Z qemu-system-i386: qemu_lock_fd_test: 28 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.192726Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.194298Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.195079Z qemu-system-i386: qemu_lock_fd_test: 28 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.197123Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.199378Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.201108Z qemu-system-i386: qemu_lock_fcntl: 28 c9 1 
F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.344335Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.345969Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.346836Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.348937Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.359691Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.360632Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.363221Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.364781Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:05.365607Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:05.367794Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success

It seems that on the receiving side some code forgets to unlock offset 0xc9, 
which causes the F_WRLCK test to fail:
2018-05-08T11:21:52.108809Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.112193Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.113028Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.115401Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.122037Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.122886Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.125189Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.126969Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.127801Z qemu-system-i386: qemu_lock_fd_test: 27 c9 1 
F_WRLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.130109Z qemu-system-i386: qemu_lock_fcntl: 27 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.859199Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 
F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:52.862010Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 
F_RDLCK>F_RDLCK 0 Success
2018-05-08T11:21:52.862673Z qemu-system-i386: qemu_lock_fd_test: 39 c9 1 
F_WRLCK>F_RDLCK 0 Success
2018-05-08T11:21:53.112935Z qemu-system-i386: qemu_lock_fd_test: 39 c9 1 
F_WRLCK>F_RDLCK 0 Success
2018-05-08T11:21:53.363246Z qemu-system-i386: qemu_lock_fd_test: 39 c9 1 
F_WRLCK>F_RDLCK 0 Success
2018-05-08T11:21:53.615668Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 
F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:53.616426Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 
F_UNLCK>F_UNLCK 0 Success
2018-05-08T11:21:53.616816Z qemu-system-i386: qemu_lock_fcntl: 39 c9 1 
F_UNLCK>F_UNLCK 0 Success
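
The pattern in this log can be reproduced outside of qemu. The following is 
a minimal standalone sketch, not qemu code. It assumes Linux 
open-file-description locks (F_OFD_SETLK/F_OFD_GETLK), so that two 
descriptors inside one process can conflict, and it reads the printf output 
above as "fd, offset, length, requested>resulting type". All demo_* names 
are made up for this example.

```c
#define _GNU_SOURCE             /* exposes F_OFD_* in <fcntl.h> on Linux */
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Fallback if the headers were included without _GNU_SOURCE (Linux values). */
#ifndef F_OFD_SETLK
#define F_OFD_GETLK 36
#define F_OFD_SETLK 37
#endif

static void demo_init_flock(struct flock *fl, off_t off, short type)
{
    memset(fl, 0, sizeof(*fl));         /* l_pid must be 0 for OFD locks */
    fl->l_type = type;
    fl->l_whence = SEEK_SET;
    fl->l_start = off;
    fl->l_len = 1;                      /* a single byte, as in the log */
}

/* Set (F_RDLCK/F_WRLCK) or clear (F_UNLCK) a lock on one byte. */
static int demo_lock_byte(int fd, off_t off, short type)
{
    struct flock fl;
    demo_init_flock(&fl, off, type);
    return fcntl(fd, F_OFD_SETLK, &fl);
}

/* Ask whether `type` could be locked; returns the type found, -1 on error. */
static int demo_test_byte(int fd, off_t off, short type)
{
    struct flock fl;
    demo_init_flock(&fl, off, type);
    if (fcntl(fd, F_OFD_GETLK, &fl) < 0) {
        return -1;
    }
    return fl.l_type;                   /* F_UNLCK means "no conflict" */
}

/* Returns 1 if the second fd sees a conflict at 0xc9, 0 if not, -1 on error. */
static int demo_conflict_seen(bool forget_unlock)
{
    char path[] = "/tmp/lock-demo-XXXXXX";
    int old_fd = mkstemp(path);
    int new_fd = old_fd < 0 ? -1 : open(path, O_RDWR);
    int result = -1;

    if (new_fd >= 0 && demo_lock_byte(old_fd, 0xc9, F_RDLCK) == 0 &&
        (forget_unlock || demo_lock_byte(old_fd, 0xc9, F_UNLCK) == 0)) {
        int found = demo_test_byte(new_fd, 0xc9, F_WRLCK);
        if (found >= 0) {
            result = (found != F_UNLCK);
        }
    }
    if (new_fd >= 0) {
        close(new_fd);
    }
    if (old_fd >= 0) {
        close(old_fd);
        unlink(path);
    }
    return result;
}
```

demo_conflict_seen(true) corresponds to the F_WRLCK>F_RDLCK lines on fd 39: 
the read lock left behind on the first descriptor is reported as a conflict. 
demo_conflict_seen(false) corresponds to the F_WRLCK>F_UNLCK lines, where 
the byte was unlocked first.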


It is unclear why this was never noticed with xen-4.10. qemu-2.9 did not 
have that bug.
Also, whether a KVM or a Xen guest is migrated should make zero difference 
for the qcow2 driver...


Olaf

Attachment: pgpWN6QTJ3Lby.pgp
Description: OpenPGP digital signature

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
