On 29/02/2024 11:56 am, Roger Pau Monné wrote:
> On Thu, Feb 29, 2024 at 10:43:04AM +0000, Andrew Cooper wrote:
>> Right now, the host x2APIC setting filters into the PV max and default
>> policies, yet PV guests cannot set MSR_APIC_BASE.EXTD or access any of the
>> x2APIC MSR range.  Therefore they absolutely shouldn't see the x2APIC bit.
>>
>> Linux has workarounds for the collateral damage caused by this leakage; it
>> unconditionally filters out the x2APIC CPUID bit, and EXTD when reading
>> MSR_APIC_BASE.
>>
>> Hide the x2APIC bit in the PV default policy, but for compatibility, tolerate
>> incoming VMs which already saw the bit.  This is logic from before the
>> default/max split in Xen 4.14 which wasn't correctly adjusted at the time.
>>
>> Update the annotation from !A to !S which slightly better describes that it
>> doesn't really exist in PV guests.  HVM guests, for which x2APIC can be
>> emulated completely, already has it unconditionally set in the max policy.
>>
>> Signed-off-by: Andrew Cooper <[email protected]>
>> ---
>> CC: Jan Beulich <[email protected]>
>> CC: Roger Pau Monné <[email protected]>
>> CC: Wei Liu <[email protected]>
>>
>> This wants backporting as far as people can tollerate, but it's really not
>> obvious which commit in 4.14 should be referenced in a Fixes: tag.
> Oh, so we didn't use to expose x2APIC in Xen < 4.14 for PV at all?

We did conditionally expose x2APIC to PV guests prior to that.

What we didn't have in Xen < 4.14 was a split between max and default
policies.  Everything before 4.14 is far harder to reason about.

>
> I think this need mentioning in the commit message, as it's not clear
> whether x2APIC has always been advertised to guests.
>
> If it's indeed only Xen 4.14 that started exposing the flag, it's IMO
> less dangerous to stop exposing it.  My main concern would be OSes
> having grow some dependency on it, and us no longer exposing it
> causing collateral damage (which would be an OS bug anyway).

As I said, Linux explicitly self-hides the cap, because at one point it
tried turning x2APIC on and got unhappy at getting a #GP back.

Juergen may remember better.  IIRC it was fallout from making WRMSR not
always-silently-safe.AFAICT NetBSD doesn't explode because it's
x2APIC-enablement logic is inside #ifndef XENPV.
>> ---
>>  xen/arch/x86/cpu-policy.c                   | 19 +++++++++++++++++--
>>  xen/include/public/arch-x86/cpufeatureset.h |  2 +-
>>  2 files changed, 18 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c
>> index 10079c26ae24..a0205672428d 100644
>> --- a/xen/arch/x86/cpu-policy.c
>> +++ b/xen/arch/x86/cpu-policy.c
>> @@ -534,6 +534,14 @@ static void __init calculate_pv_max_policy(void)
>>      *p = host_cpu_policy;
>>      x86_cpu_policy_to_featureset(p, fs);
>>  
>> +    /*
>> +     * Xen at the time of writing (Feb 2024, 4.19 dev cycle) used to leak 
>> the
>> +     * host x2APIC capability into PV guests, but never supported the guest
>> +     * trying to turn x2APIC mode on.  Tolerate an incoming VM which saw the
>> +     * x2APIC CPUID bit.
>> +     */
>> +    __set_bit(X86_FEATURE_X2APIC, fs);
>> +
>>      for ( i = 0; i < ARRAY_SIZE(fs); ++i )
>>          fs[i] &= pv_max_featuremask[i];
>>  
>> @@ -566,6 +574,14 @@ static void __init calculate_pv_def_policy(void)
>>      *p = pv_max_cpu_policy;
>>      x86_cpu_policy_to_featureset(p, fs);
>>  
>> +    /*
>> +     * PV guests have never been able to use x2APIC mode, but at the time of
>> +     * writing (Feb 2024, 4.19 dev cycle), the host value used to leak into
>> +     * guests.  Hide it by default so new guests don't get mislead into
>> +     * thinking that they can use x2APIC.
>> +     */
>> +    __clear_bit(X86_FEATURE_X2APIC, fs);
> IIRC if you use the 'S' tag it won't be added to the default PV policy
> already, so there should be nothing to clear?  pv_def_featuremask
> shouldn't contain the bit in the first place.

Bah.  That means there's a bug visible in context.  Serves me right for
last minute clean-up...

I need to set the bit after applying pv_max_featuremask[].

~Andrew

Reply via email to