On 18.11.2022 14:55, Roger Pau Monné wrote:
> On Fri, Nov 18, 2022 at 01:54:33PM +0100, Jan Beulich wrote:
>> On 18.11.2022 13:33, Andrew Cooper wrote:
>>> On 18/11/2022 10:31, Jan Beulich wrote:
>>>> Linux'es relatively new use of HVMOP_set_evtchn_upcall_vector has
>>>> exposed a problem with the marking of the respective vector as
>>>> pending: For quite some time Linux has been checking whether any stale
>>>> ISR or IRR bits would still be set while preparing the LAPIC for use.
>>>> This check is now triggering on the upcall vector, as the registration,
>>>> at least for APs, happens before the LAPIC is actually enabled.
>>>>
>>>> In software-disabled state an LAPIC would not accept any interrupt
>>>> requests and hence no IRR bit would newly become set while in this
>>>> state. As a result it is also wrong for us to mark the upcall vector as
>>>> having a pending request when the vLAPIC is in this state.
>>>
>>> I agree with this.
>>>
>>>> To compensate for the "enabled" check added to the assertion logic, add
>>>> logic to (conditionally) mark the upcall vector as having a request
>>>> pending at the time the LAPIC is being software-enabled by the guest.
>>>
>>> But this, I don't think is appropriate.
>>>
>>> The point of raising on enable is allegedly to work around setup race
>>> conditions.  I'm unconvinced by this reasoning, but it is what it is,
>>> and the stated behaviour is to raise there and then.
>>>
>>> If a guest enables evtchn while the LAPIC is disabled, then the
>>> interrupt is lost.  Like every other interrupt in an x86 system.
>>
>> Edge triggered ones you mean, I suppose, but yes.
>>
>>> I don't think there is any credible way a guest kernel author can expect
>>> the weird evtchn edgecase to wait for an arbitrary point in the future,
>>> and it's a corner case that I think is worth not keeping.
>>
>> Well - did you look at 7b5b8ca7dffd ("x86/upcall: inject a spurious event
>> after setting upcall vector"), referenced by the Fixes: tag? The issue is
>> that with evtchn_upcall_pending once set, there would never again be a
>> notification. So if what you say is to be the model we follow, then that
>> earlier change was perhaps wrong as well. Instead it should then have
>> been a guest change (as also implicit from your reply) to clear
>> evtchn_upcall_pending after vCPU info registration (there) or LAPIC
>> enabling (here), perhaps by way of "manually" invoking the handling of
>> that pending event, or by issuing a self-IPI with that vector.
>> Especially the LAPIC enabling case would then be yet another Xen-specific
>> on a guest code path which better wouldn't have to be aware of Xen.
> 
> Another option might be to clear evtchn_upcall_pending once the vLAPIC
> is enabled, so that further setting of evtchn_upcall_pending will
> inject the vector.  I'm worried however whether that could break
> existing users, as this would be an interface behavior change.

You mean _Xen_ clearing the flag? No, that breaks firmly documented
behavior. Xen only ever sets this field.

Jan

Reply via email to