On 26/03/2024 9:13 am, Jan Beulich wrote:
> On 25.03.2024 19:18, Andrew Cooper wrote:
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -2377,7 +2377,8 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).
>>  >              {msr-sc,rsb,verw,ibpb-entry}=<bool>|{pv,hvm}=<bool>,
>>  >              bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,psfd,
>>  >              eager-fpu,l1d-flush,branch-harden,srb-lock,
>> ->              unpriv-mmio,gds-mit,div-scrub,lock-harden}=<bool> ]`
>> +>              unpriv-mmio,gds-mit,div-scrub,lock-harden,
>> +>              bp-spec-reduce}=<bool> ]`
>>  
>>  Controls for speculative execution sidechannel mitigations.  By default, Xen
>>  will pick the most appropriate mitigations based on compiled in support,
>> @@ -2509,6 +2510,12 @@ boolean can be used to force or prevent Xen from using speculation barriers to
>>  protect lock critical regions.  This mitigation won't be engaged by default,
>>  and needs to be explicitly enabled on the command line.
>>  
>> +On hardware supporting SRSO_MSR_FIX, the `bp-spec-reduce=` option can be used
>> +to force or prevent Xen from using MSR_BP_CFG.BP_SPEC_REDUCE to mitigate the
>> +SRSO (Speculative Return Stack Overflow) vulnerability.
> "... against HVM guests" to avoid things being left ambiguous, and to also ...
>
>>  By default, Xen will
>> +use bp-spec-reduce when available, as it is preferable to using `ibpb-entry=hvm`
>> +to mitigate SRSO.
> ... correlate with the `ibpb-entry=hvm` here?

Ok.

> Maybe at the start of the paragraph also add "AMD"?

The rest of the text specifically doesn't mention vendors, in an attempt
to prevent it going stale.

Although, now that I re-read it, it does leave the statements about
microcode drops ambiguous.

>
>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -1009,16 +1009,31 @@ static void cf_check fam17_disable_c6(void *arg)
>>      wrmsrl(MSR_AMD_CSTATE_CFG, val & mask);
>>  }
>>  
>> -static void amd_check_erratum_1485(void)
>> +static void amd_check_bp_cfg(void)
>>  {
>> -    uint64_t val, chickenbit = (1 << 5);
>> +    uint64_t val, new = 0;
>>  
>> -    if (cpu_has_hypervisor || boot_cpu_data.x86 != 0x19 || !is_zen4_uarch())
>> +    /*
>> +     * AMD Erratum #1485.  Set bit 5, as instructed.
>> +     */
>> +    if (!cpu_has_hypervisor && boot_cpu_data.x86 == 0x19 && is_zen4_uarch())
>> +            new |= (1 << 5);
>> +
>> +    /*
>> +     * On hardware supporting SRSO_MSR_FIX, we prefer BP_SPEC_REDUCE to
>> +     * IBPB-on-entry to mitigate SRSO for HVM guests.
>> +     */
>> +    if (IS_ENABLED(CONFIG_HVM) && boot_cpu_has(X86_FEATURE_SRSO_US_NO) &&
>> +            opt_bp_spec_reduce)
> Nit: Indentation is odd here (wants to be a tab followed by a few spaces).
>
>> +            new |= BP_CFG_SPEC_REDUCE;
> I take it that this goes from the assumption that it is deemed pretty unlikely
> that nowadays people would only run PV guests on a host? Otherwise, assuming
> that - like almost any such mitigation - its use costs performance, enabling
> the mitigation only as long as there are any HVM guests around might be better.

I have no idea what the perf cost is.  Native (non-virtualisation) systems
aren't expected to use this at all.

However, I have no desire to try to make this dynamic based on whether any
HVM guests are active.  For one, it would need a real spinlock (or a
stop-machine context) to avoid racing the MSR read-modify-write, and anyone
running PV guests on this hardware really oughtn't to be.

[Edit, See later, but we need this for PV too]
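For illustration only, a dynamic toggle would need roughly this shape
(hypothetical sketch, not something I intend to write; the helper name and
the locking scheme are made up):

	/* Serialise BP_CFG read-modify-writes against other updaters. */
	static DEFINE_SPINLOCK(bp_cfg_lock);

	static void cf_check set_bp_spec_reduce(void *enable)
	{
		uint64_t val;

		spin_lock(&bp_cfg_lock);
		rdmsrl(MSR_AMD64_BP_CFG, val);

		if (enable)
			val |= BP_CFG_SPEC_REDUCE;
		else
			val &= ~BP_CFG_SPEC_REDUCE;

		wrmsrl(MSR_AMD64_BP_CFG, val);
		spin_unlock(&bp_cfg_lock);
	}

plus an on_each_cpu() call each time the count of running HVM guests
transitions between zero and non-zero.  That's a lot of machinery for a
configuration nobody ought to be in.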

>
>> +    /* Avoid reading BP_CFG if we don't intend to change anything. */
>> +    if (!new)
>>              return;
>>  
>>      rdmsrl(MSR_AMD64_BP_CFG, val);
>>  
>> -    if (val & chickenbit)
>> +    if ((val & new) == new)
>>              return;
> Since bits may also need turning off:
>
>       if (!((val ^ new) & (BP_CFG_SPEC_REDUCE | (1 << 5))))
>               return;
>
> and the !new early-out dropped, too? Looks like this wasn't quite right
> before, either. 

That's adding unnecessary complexity.  It's unlikely that we'll ever
need to clear bits like this.
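To spell it out, the only case your form catches and mine deliberately
doesn't is clearing a stale bit, e.g. (hypothetical values):

	val = (1 << 5) | BP_CFG_SPEC_REDUCE;	/* set on a previous pass */
	new = (1 << 5);				/* SPEC_REDUCE no longer wanted */

Here (val & new) == new holds, so we'd take the early exit and leave
BP_SPEC_REDUCE set behind.  But that only matters if something else had set
the bit previously, which as said isn't a case we expect to hit.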

>
>> @@ -1078,22 +1082,41 @@ static void __init ibpb_calculations(void)
>>           * Confusion.  Mitigate with IBPB-on-entry.
>>           */
>>          if ( !boot_cpu_has(X86_FEATURE_BTC_NO) )
>> -            def_ibpb_entry = true;
>> +            def_ibpb_entry_pv = def_ibpb_entry_hvm = true;
>>  
>>          /*
>> -         * Further to BTC, Zen3/4 CPUs suffer from Speculative Return Stack
>> -         * Overflow in most configurations.  Mitigate with IBPB-on-entry if we
>> -         * have the microcode that makes this an effective option.
>> +         * Further to BTC, Zen3 and later CPUs suffer from Speculative Return
>> +         * Stack Overflow in most configurations.  Mitigate with IBPB-on-entry
>> +         * if we have the microcode that makes this an effective option,
>> +         * except where there are other mitigating factors available.
>>           */
> Hmm, is "Zen3 and later" really appropriate?

Yes.

SRSO isn't fixed until SRSO_NO is enumerated.

> Isn't your "speculative coding"
> remark related to perhaps both of the new bits becoming available on Zen5
> (meaning that the vulnerability would be limited to the guest/host boundary,
> as long as the MSR-based mitigation isn't used)?

Both of these are interim fixes.
>>          if ( !boot_cpu_has(X86_FEATURE_SRSO_NO) &&
>>               boot_cpu_has(X86_FEATURE_IBPB_BRTYPE) )
>> -            def_ibpb_entry = true;
>> +        {
>> +            /*
>> +             * SRSO_U/S_NO is a subset of SRSO_NO, identifying that SRSO isn't
>> +             * possible across the user/supervisor boundary.  We only need to
>> +             * use IBPB-on-entry for PV guests on hardware which doesn't
>> +             * enumerate SRSO_US_NO.
>> +             */
>> +            if ( !boot_cpu_has(X86_FEATURE_SRSO_US_NO) )
>> +                def_ibpb_entry_pv = true;
> opt_rsb_pv continues to take opt_pv32 into account, despite us having
> removed security support for 32-bit PV guests (by wording that's sadly a
> little ambiguous). Shouldn't that be done here too then, seeing that the
> flag only covers transitions from ring3?

Hmm, probably.

In practice, all these systems have CET, so PV32 will be off by default,
but if people really want to run PV32 guests, we might as well do our best.
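I.e. roughly this (untested sketch, mirroring how opt_rsb_pv accounts for
opt_pv32):

            /*
             * SRSO_US_NO only covers attacks originating from ring3.  PV32
             * kernels run in ring1, so they'd still need IBPB-on-entry.
             */
            if ( !boot_cpu_has(X86_FEATURE_SRSO_US_NO) || opt_pv32 )
                def_ibpb_entry_pv = true;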

>
>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>> @@ -304,7 +304,9 @@ XEN_CPUFEATURE(FSRSC,              11*32+19) /*A  Fast Short REP SCASB */
>>  XEN_CPUFEATURE(AMD_PREFETCHI,      11*32+20) /*A  PREFETCHIT{0,1} Instructions */
>>  XEN_CPUFEATURE(SBPB,               11*32+27) /*A  Selective Branch Predictor Barrier */
>>  XEN_CPUFEATURE(IBPB_BRTYPE,        11*32+28) /*A  IBPB flushes Branch Type predictions too */
>> -XEN_CPUFEATURE(SRSO_NO,            11*32+29) /*A  Hardware not vulenrable to Speculative Return Stack Overflow */
>> +XEN_CPUFEATURE(SRSO_NO,            11*32+29) /*A  Hardware not vulnerable to Speculative Return Stack Overflow */
>> +XEN_CPUFEATURE(SRSO_US_NO,         11*32+30) /*A  Hardware not vulnerable to SRSO across the User/Supervisor boundary */
> Can we validly expose this to 64-bit PV guests, where there's no CPL
> boundary? Or else isn't my "x86/PV: issue branch prediction barrier
> when switching 64-bit guest to kernel mode" needed as a prereq?

Based on uarch details, if BP-SPEC-REDUCE is active, we can advertise
SRSO_US_NO to PV guests.

But I doubt I'll be able to persuade AMD to put the safety details
behind this in writing.


I also missed the hunks adding SRSO_US_NO and SRSO_MSR_FIX to
print_details().  Fixed locally.
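For reference, the shape will be along the lines of the existing e21a
entries in print_details(), e.g. (exact placement and formatting to match
the surrounding code):

           (e21a & cpufeat_mask(X86_FEATURE_SRSO_US_NO))   ? " SRSO_US_NO"   : "",
           (e21a & cpufeat_mask(X86_FEATURE_SRSO_MSR_FIX)) ? " SRSO_MSR_FIX" : "",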

~Andrew
