On Tue, Nov 29, 2016 at 11:15:36AM -0800, Stefano Stabellini wrote:
> On Tue, 29 Nov 2016, Juergen Gross wrote:
> > On 29/11/16 08:34, Wei Liu wrote:
> > > On Mon, Nov 28, 2016 at 02:53:57PM +0100, Cédric Bosdonnat wrote:
> > >> Resume sometimes fails silently for HVM guests. Calling
> > >> xc_domain_resume() and libxl__domain_resume_device_model() in the
> > >> reverse of the order used in the suspend code fixes the problem.
> > >>
> > >> Signed-off-by: Cédric Bosdonnat <cbosdon...@suse.com>
> > >
> > > I think it would be nice to explain why reversing the order fixes the
> > > problem for you. My guess is that the device model needs to be ready
> > > when the guest runs, but I'm not fully convinced by this explanation --
> > > guests should just be trapped in the hypervisor waiting for the device
> > > model to come up.
> > I'm not completely sure this is true. qemu is in "stopped" state, so it
> > might be that any emulation requests are just silently dropped. In any
> > case it is just weird to stop qemu in the suspend case only after
> > suspending the domain, but to let it continue _after_ resuming the
> > domain. So I'd rather expect an explanation (not from Cedric) why this
> > should be okay in case the patch isn't accepted.
> Calling xc_domain_resume before libxl__domain_resume_device_model seems
> wrong to me. For example in libxl_domain_unpause we call
> libxl__domain_resume_device_model, then xc_domain_unpause. We should get
> the DM ready before resuming the VM, right?
Yes, I would think so, too.

I'm inclined to accept this patch. At the end of the day, even if QEMU
doesn't drop requests now, it doesn't mean it will never drop requests
in the future.
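
For illustration only, the ordering argued for above can be sketched as
below. Every name in it (struct trace, record(), sketch_suspend(),
sketch_resume(), the STEP_* values) is a hypothetical stand-in that just
records call order; none of it is the real libxl or libxc code, and the
real functions take different arguments.

```c
/* Hypothetical stand-ins for the calls discussed in this thread. They only
 * record the order in which the steps run; this is NOT the libxl/libxc API. */
enum step {
    STEP_DOMAIN_SUSPEND,   /* stands in for suspending the guest domain */
    STEP_DM_STOP,          /* stands in for stopping the device model */
    STEP_DM_RESUME,        /* stands in for libxl__domain_resume_device_model() */
    STEP_DOMAIN_RESUME     /* stands in for xc_domain_resume() */
};

#define MAX_STEPS 8

struct trace {
    enum step steps[MAX_STEPS];
    int count;
};

/* Append one step to the trace; returns -1 if the trace is full. */
static int record(struct trace *t, enum step s)
{
    if (t->count >= MAX_STEPS)
        return -1;
    t->steps[t->count++] = s;
    return 0;
}

/* Suspend path as described in the thread: the domain is suspended first,
 * and only then is the device model stopped. */
int sketch_suspend(struct trace *t)
{
    if (record(t, STEP_DOMAIN_SUSPEND) != 0)
        return -1;
    return record(t, STEP_DM_STOP);
}

/* Resume path with the proposed ordering: the mirror image of suspend, so
 * the device model is running again before the guest is. */
int sketch_resume(struct trace *t)
{
    if (record(t, STEP_DM_RESUME) != 0)
        return -1;
    return record(t, STEP_DOMAIN_RESUME);
}
```

Running sketch_suspend() and then sketch_resume() on a zeroed struct trace
leaves the four steps in symmetric order: domain suspend, DM stop, DM
resume, domain resume.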
Xen-devel mailing list