On 13.03.25 16:54, Jan Beulich wrote:
On 11.03.2025 21:59, Julien Grall wrote:
On 05/03/2025 09:11, Mykola Kvach wrote:
Invocation of the CPU_UP_PREPARE notification
on ARM64 during resume causes a crash:

(XEN) [  315.807606] Error bringing CPU1 up: -16
(XEN) [  315.811926] Xen BUG at common/cpu.c:258
[...]
(XEN) [  316.142765] Xen call trace:
(XEN) [  316.146048]    [<00000a0000202264>] enable_nonboot_cpus+0x128/0x1ac 
(PC)
(XEN) [  316.153219]    [<00000a000020225c>] enable_nonboot_cpus+0x120/0x1ac 
(LR)
(XEN) [  316.160391]    [<00000a0000278180>] suspend.c#system_suspend+0x4c/0x1a0
(XEN) [  316.167476]    [<00000a0000206b70>] 
domain.c#continue_hypercall_tasklet_handler+0x54/0xd0
(XEN) [  316.176117]    [<00000a0000226538>] 
tasklet.c#do_tasklet_work+0xb8/0x100
(XEN) [  316.183288]    [<00000a0000226920>] do_tasklet+0x68/0xb0
(XEN) [  316.189077]    [<00000a000026e120>] domain.c#idle_loop+0x7c/0x194
(XEN) [  316.195644]    [<00000a0000277638>] shutdown.c#halt_this_cpu+0/0x14
(XEN) [  316.202383]    [<0000000000000008>] 0000000000000008

Freeing per-CPU areas and setting __per_cpu_offset to INVALID_PERCPU_AREA
only occur when !park_offline_cpus and system_state is not SYS_STATE_suspend.
On ARM64, park_offline_cpus is always false, so setting __per_cpu_offset to
INVALID_PERCPU_AREA depends solely on the system state.

If the system is suspended, this area is not freed, and during resume, an error
occurs in init_percpu_area, causing a crash because INVALID_PERCPU_AREA is not
set and park_offline_cpus remains 0:

      if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
          return park_offline_cpus ? 0 : -EBUSY;

It appears that the same crash can occur on x86 if park_offline_cpus is set
to 0 during Xen suspend.

I am rather confused. Looking at the x86 code, it seems
park_offline_cpus is cleared for AMD platforms. So are you saying the
suspend/resume doesn't work on AMD?

Right now I can't see how it would work there. I've asked Marek for 
clarification
as to their users using S3 only on Intel hardware.

Jan

Seems as if this issue has been introduced with commit f75780d26b2f
("xen: move per-cpu area management into common code"). Before that
on x86 there was just:

    if ( __per_cpu_offset[cpu] != INVALID_PERCPU_AREA )
        return 0;

in init_percpu_area().


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature

Reply via email to