On Tue, Feb 7, 2017 at 9:53 AM, Razvan Cojocaru <rcojoc...@bitdefender.com>
wrote:

> Hello,
>
> Setting, e.g. 16 VCPUs for a HVM guest, ends up blocking the guest
> completely when subscribing to vm_events, apparently because of this
> code in xen/common/vm_event.c:
>
>     /* Give this vCPU a black eye if necessary, on the way out.
>      * See the comments above wake_blocked() for more information
>      * on how this mechanism works to avoid waiting. */
>     avail_req = vm_event_ring_available(ved);
>     if( current->domain == d && avail_req < d->max_vcpus )
>         vm_event_mark_and_pause(current, ved);
>
> It would appear that even if the guest only has 2 online VCPUs, the
> "avail_req < d->max_vcpus" condition will pause current, and we
> eventually end up with all the VCPUs paused.
>
> An ugly hack ("avail_req < 2") has allowed booting a guest with many
> VCPUs (max_vcpus, the guest only brings 2 VCPUs online), however that's
> just to prove that this was the culprit - a real solution needs a
> more in-depth understanding of the issue and of potential fixes. That's
> basically very old code (pre-2012 at least) that got moved around into
> the current shape of Xen today - please CC anyone relevant to the
> discussion that you're aware of.
>
> Thoughts?
>

I think this is a side-effect of the growth of the vm_event structure and the
fact that we have a single page ring. The check effectively sets a
threshold of having enough space for each vCPU to place at least one more
event on the ring, and if that's not the case it gets paused. OTOH I think
this would only have an effect on asynchronous events, for all other events
the vCPU is already paused. Is that the case you have?

Tamas
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
