On 12/08/2025 9:41 am, Jan Beulich wrote:
> On 08.08.2025 22:23, Andrew Cooper wrote:
>> --- a/xen/arch/x86/acpi/wakeup_prot.S
>> +++ b/xen/arch/x86/acpi/wakeup_prot.S
>> @@ -63,6 +63,9 @@ LABEL(s3_resume)
>>          pushq   %rax
>>          lretq
>>  1:
>> +        /* Set up early exceptions and CET before entering C properly. */
>> +        call    ap_early_traps_init
> But this is the BSP?

By the end of the cleanup, what we have is:

At boot only:
 * really early init, basic exception handling only
 * regular init (inc syscall trampolines)
 * late re-init as we change the stack linear address

For everything else (APs, S3, hot-online):
 * early, full exception handling
 * regular init (inc syscall trampolines)


Currently, these are named:
 * bsp_early_traps_init()
 * traps_init()
 * bsp_traps_reinit()

and
 * ap_early_traps_init()
 * percpu_traps_init()


Perhaps ap_early_traps_init() should be named
percpu_early_traps_init()?  But I'm open to suggestions.

>
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -327,12 +327,7 @@ void asmlinkage start_secondary(void)
>>      struct cpu_info *info = get_cpu_info();
>>      unsigned int cpu = smp_processor_id();
>>  
>> -    /* Critical region without IDT or TSS.  Any fault is deadly! */
>> -
>> -    set_current(idle_vcpu[cpu]);
>> -    this_cpu(curr_vcpu) = idle_vcpu[cpu];
>>      rdmsrl(MSR_EFER, this_cpu(efer));
>> -    init_shadow_spec_ctrl_state(info);
>>  
>>      /*
>>       * Just as during early bootstrap, it is convenient here to disable
>> @@ -352,14 +347,6 @@ void asmlinkage start_secondary(void)
>>       */
>>      spin_debug_disable();
>>  
>> -    get_cpu_info()->use_pv_cr3 = false;
>> -    get_cpu_info()->xen_cr3 = 0;
>> -    get_cpu_info()->pv_cr3 = 0;
>> -
>> -    load_system_tables();
>> -
>> -    /* Full exception support from here on in. */
>> -
>>      if ( cpu_has_pks )
>>          wrpkrs_and_cache(0); /* Must be before setting CR4.PKS */
>>  
>> @@ -1064,8 +1051,12 @@ static int cpu_smpboot_alloc(unsigned int cpu)
>>              goto out;
>>  
>>      info = get_cpu_info_from_stack((unsigned long)stack_base[cpu]);
>> +    memset(info, 0, sizeof(*info));
> Why do we suddenly need this? Or is this just out of an abundance of
> caution (while making the individual ->*_cr3 writes unnecessary)?

cpu_alloc_stack() explicitly uses alloc_xenheap_pages() which uses
MEMF_no_scrub.  It will usually be zeroed memory because we allocate
them all at the start of day, but it also has a habbit of being 0xc2'd
when running under Xen.

Also yes, I do dislike the ad-hoc zeroes of misc fields.

>
>> +    init_shadow_spec_ctrl_state(info);
> May I suggest to move this further down a little, at least ...
>
>>      info->processor_id = cpu;
> ... past here? Just in case other values in the struct may be needed
> in the function at some point.

Ok.

>
>>      info->per_cpu_offset = __per_cpu_offset[cpu];
>> +    info->current_vcpu = idle_vcpu[cpu];
> To be able to spot this, I think it wants /* set_current() */ or some
> such.

Ok.

>
>> +    per_cpu(curr_vcpu, cpu) = idle_vcpu[cpu];
> It's a little odd to do this early (and remotely), but it looks all fine
> with how the variable is currently used.

It did take a little while for me to conclude that it is safe, but yes -
it does relax a lot of ordering constraints for AP bringup.

~Andrew


Reply via email to