On 18/08/2025 10:35 am, Jan Beulich wrote:
> On 08.08.2025 22:23, Andrew Cooper wrote:
>> With the shadow stack and exception handling adjustments in place, we can
>> now activate FRED when appropriate.  Note that opt_fred is still disabled by
>> default.
>>
>> Introduce init_fred() to set up all the MSRs relevant for FRED.  FRED uses
>> MSR_STAR (entries from Ring3 only), and MSR_FRED_SSP_SL0 aliases MSR_PL0_SSP
>> when CET-SS is active.  Otherwise, they're all new MSRs.
>>
>> With init_fred() existing, load_system_tables() and legacy_syscall_init()
>> should only be used when setting up IDT delivery.  Insert ASSERT()s to this
>> effect, and adjust the various *_init() functions to make this property true.
>>
>> Per the documentation, ap_early_traps_init() is responsible for switching off
>> the boot GDT, which needs doing even in FRED mode.
>>
>> Finally, set CR4.FRED in {bsp,ap}_early_traps_init().
> Probably you've done that already, but these last two paragraphs will need
> updating following patch 08 v1.1.

It's on my list, but not done yet.

>
>> Xen can now boot in FRED mode up until starting a PV guest, where it faults
>> because IRET is not permitted to change privilege.
>>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> Reviewed-by: Jan Beulich <jbeul...@suse.com>

Thanks, but I fear this patch has changed too much.  I'll take a
decision when I've cleaned up the integration of the PV work.

>
>> @@ -274,6 +279,44 @@ static void __init init_ler(void)
>>      setup_force_cpu_cap(X86_FEATURE_XEN_LBR);
>>  }
>>  
>> +/*
>> + * Set up all MSRs relevant for FRED event delivery.
>> + *
>> + * Xen does not use any of the optional config in MSR_FRED_CONFIG, so all
>> + * that is needed is the entrypoint.
>> + *
>> + * Because FRED always provides a good stack, NMI and #DB do not need any
>> + * special treatment.  Only #DF needs another stack level, and #MC for the
>> + * off chance that Xen's main stack suffers an uncorrectable error.
>> + *
>> + * FRED reuses MSR_STAR to provide the segment selector values to load on
>> + * entry from Ring3.  Entry from Ring0 leaves %cs and %ss unmodified.
>> + */
>> +static void init_fred(void)
>> +{
>> +    unsigned long stack_top = get_stack_bottom() & ~(STACK_SIZE - 1);
>> +
>> +    ASSERT(opt_fred == 1);
>> +
>> +    wrmsrns(MSR_STAR, XEN_MSR_STAR);
>> +    wrmsrns(MSR_FRED_CONFIG, (unsigned long)entry_FRED_R3);
>> +
>> +    wrmsrns(MSR_FRED_RSP_SL0, (unsigned long)(&get_cpu_info()->_fred + 1));
>> +    wrmsrns(MSR_FRED_RSP_SL1, 0);
> In the event of a bug somewhere causing this slot to be accessed, is the
> wrapping behavior well-defined, resulting in an attempt to write to the
> top end of VA space? (Then again, if the wrapping itself caused a fault,
> the overall effect would be largely the same - in many cases #DF.)

The wrapping is well defined - like other cases, it goes to the top of
the address space, but that's owned by PV guests.  SMAP ought to mitigate
what would otherwise be a priv-esc.
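
To make the wrap concrete, here is a minimal standalone sketch (plain C, not
Xen code; first_push_addr() is a hypothetical helper, and a plain 8-byte push
is assumed rather than a full FRED frame):

```c
#include <stdint.h>

/*
 * Hypothetical helper: address of the first 8-byte slot written when
 * pushing onto a stack whose pointer is currently 'rsp'.  Unsigned
 * arithmetic wraps modulo 2^64, which mirrors the wrap discussed above.
 */
static uint64_t first_push_addr(uint64_t rsp)
{
    return rsp - 8;
}
```

With rsp == 0 this yields 0xfffffffffffffff8, i.e. the last 8 bytes of the
address space.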

With IDT, we poisoned the unused pointers with non-canonical addresses,
but that's not possible here, as they're MSRs and checked at this point,
rather than when they're used.
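
For reference, a minimal sketch of the canonical-address rule that rules out
poison values here (standalone C, not Xen code; is_canonical() is a
hypothetical helper, and 48-bit virtual addresses are assumed, 57 with LA57):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: an address is canonical when bits 63 down to
 * (va_bits - 1) are all zero or all one.  va_bits is assumed to be 48
 * (or 57 with LA57) and must be in the range 1..64.
 */
static bool is_canonical(uint64_t addr, unsigned int va_bits)
{
    uint64_t upper = addr >> (va_bits - 1);
    uint64_t ones  = UINT64_MAX >> (va_bits - 1);

    return upper == 0 || upper == ones;
}
```

is_canonical(0xfffffffffffffff8, 48) holds, so an address at the very top of
the address space is a legal value, whereas a poison such as
0x8000000000000000 is not canonical.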

I suspect the best we can do is reuse the #DB or NMI stacks, and
intentionally reverse the regular and shadow stack pointers, meaning
that any attempt to use SL1 will hit a guard page and escalate to #DF.

~Andrew
