Hi Mykola,
Mykola Kvach <xakep.ama...@gmail.com> writes: > From: Mykola Kvach <mykola_kv...@epam.com> > > Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface, > allowing guests to request suspend via the PSCI v1.0 SYSTEM_SUSPEND call > (both 32-bit and 64-bit variants). > > Implementation details: > - Add SYSTEM_SUSPEND function IDs to PSCI definitions > - Trap and handle SYSTEM_SUSPEND in vPSCI > - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return > PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system > in hwdom_shutdown() via domain_shutdown > - Require all secondary VCPUs of the calling domain to be offline before > suspend, as mandated by the PSCI specification > > The arch_domain_resume() function is an architecture-specific hook that is > invoked during domain resume to perform any necessary setup or restoration > steps required by the platform. > > The new vpsci_vcpu_up_prepare() helper is called on the resume path to set up > the vCPU context (such as entry point, some system regs and context ID) before > resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of common > code and avoids intrusive changes to the generic resume flow. > > Usage: > > For Linux-based guests, suspend can be initiated with: > echo mem > /sys/power/state > or via: > systemctl suspend > > Resuming the guest is performed from control domain using: > xl resume <domain> > > Signed-off-by: Mykola Kvach <mykola_kv...@epam.com> Reviewed-by: Volodymyr Babchuk <volodymyr_babc...@epam.com> But check a small comment below > --- > Changes in V11: > - introduce arch_domain_resume() and vpsci_vcpu_up_prepare(), which are now > called on the resume path to avoid extra modifications to common code. > The vCPU context is now updated during domain resume. > > Changes in V10: > - small changes to the commit message reflect updates introduced in this > version of the patch. > - Comments are improved, clarified, and expanded, especially regarding PSCI > requirements and context handling. > - An ARM-specific helper (domain_resume_nopause_helper) > - gprintk() and PRIregister are used for logging in vPSCI code. > - An isb() is added before p2m_save_state > - The is_64bit_domain check is dropped when masking the upper part of entry > point and cid for SMC32 SYSTEM_SUSPEND PSCI calls > > Changes in V9: > - no functional changes > - cosmetic chnages after review > - enhance commit message and add extra comment to the code after review > > Changes in V8: > - GIC and virtual timer context must be saved when the domain suspends > - rework locking > - minor changes after code review > > Changes in V7: > - add proper locking > - minor changes after code review > > Changes in V6: > - skip execution of ctxt_switch_from for vcpu that is in paused domain > - add implementation of domain_resume without domain_pause > - add helper function to determine if vcpu is suspended or not > - ignore upper 32 bits of argument values when the domain is 64-bit > and calls the SMC32 SYSTEM_SUSPEND function > - cosmetic changes after review > > Changes in V5: > - don't use standby mode, restore execution in a provided by guest point > - move checking that all CPUs, except current one, are offline to after > pausing the vCPUs > - provide ret status from arch_domain_shutdown and handle it in > domain_shutdown > - adjust VPSCI_NR_FUNCS to reflect the number of newly added PSCI functions > > Changes in V4: > Dropped all changes related to watchdog, domain is marked as shutting > down in domain_shutdown and watchdog timeout handler won't trigger > because of it. > > Previous versions included code to manage Xen watchdog timers during suspend, > but this was removed. When a guest OS starts the Xen watchdog (either via the > kernel driver or xenwatchdogd), it is responsible for managing that state > across suspend/resume. On Linux, the Xen kernel driver properly stops the > watchdog during suspend. However, when xenwatchdogd is used instead, suspend > handling is incomplete, potentially leading to watchdog-triggered resets on > resume. Xen leaves watchdog handling to the guest OS and its services. > > Dropped all changes related to VCPU context, because instead domain_shutdown > is used, so we don't need any extra changes for suspending domain. > > Changes in V3: > Dropped all domain flags and related code (which touched common functions like > vcpu_unblock), keeping only the necessary changes for Xen suspend/resume, i.e. > suspend/resume is now fully supported only for the hardware domain. > Proper support for domU suspend/resume will be added in a future patch. > This patch does not yet include VCPU context reset or domain context > restoration in VCPU. > --- > xen/arch/arm/domain.c | 22 +++++ > xen/arch/arm/include/asm/domain.h | 6 ++ > xen/arch/arm/include/asm/perfc_defn.h | 1 + > xen/arch/arm/include/asm/psci.h | 2 + > xen/arch/arm/include/asm/vpsci.h | 5 +- > xen/arch/arm/vpsci.c | 114 +++++++++++++++++++++----- > xen/arch/ppc/stubs.c | 5 ++ > xen/arch/riscv/stubs.c | 5 ++ > xen/arch/x86/domain.c | 5 ++ > xen/common/domain.c | 9 ++ > xen/include/xen/domain.h | 2 + > 11 files changed, 156 insertions(+), 20 deletions(-) > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > index 863ae18157..6fd73eedde 100644 > --- a/xen/arch/arm/domain.c > +++ b/xen/arch/arm/domain.c > @@ -12,6 +12,8 @@ > #include <xen/softirq.h> > #include <xen/wait.h> > > +#include <public/sched.h> > + > #include <asm/arm64/sve.h> > #include <asm/cpuerrata.h> > #include <asm/cpufeature.h> > @@ -27,6 +29,7 @@ > #include <asm/tee/tee.h> > #include <asm/vfp.h> > #include <asm/vgic.h> > +#include <asm/vpsci.h> > #include <asm/vtimer.h> > > #include "vpci.h" > @@ -880,6 +883,25 @@ void arch_domain_creation_finished(struct domain *d) > p2m_domain_creation_finished(d); > } > > +int arch_domain_resume(struct domain *d) > +{ > + int rc; > + typeof(d->arch.resume_ctx) *ctx = &d->arch.resume_ctx; > + > + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend ) > + { > + dprintk(XENLOG_WARNING, > + "%pd: Invalid domain state for resume: is_shutting_down=%d, > shutdown_code=%d\n", > + d, d->is_shutting_down, d->shutdown_code); > + return -EINVAL; > + } This check probably can go into common domain_resume() function, as there is nothing arch-specific in here. Probably this can be done during commit, to save us from v12? If commiters are okay with this. > + > + rc = vpsci_vcpu_up_prepare(ctx->wake_cpu , ctx->ep, ctx->cid); > + memset(ctx, 0, sizeof(*ctx)); > + > + return rc; > +} > + > static int is_guest_pv32_psr(uint32_t psr) > { > switch (psr & PSR_MODE_MASK) > diff --git a/xen/arch/arm/include/asm/domain.h > b/xen/arch/arm/include/asm/domain.h > index a3487ca713..68185fc4d6 100644 > --- a/xen/arch/arm/include/asm/domain.h > +++ b/xen/arch/arm/include/asm/domain.h > @@ -121,6 +121,12 @@ struct arch_domain > void *tee; > #endif > > + struct resume_info { > + register_t ep; > + register_t cid; > + struct vcpu *wake_cpu; > + } resume_ctx; > + > } __cacheline_aligned; > > struct arch_vcpu > diff --git a/xen/arch/arm/include/asm/perfc_defn.h > b/xen/arch/arm/include/asm/perfc_defn.h > index effd25b69e..8dfcac7e3b 100644 > --- a/xen/arch/arm/include/asm/perfc_defn.h > +++ b/xen/arch/arm/include/asm/perfc_defn.h > @@ -33,6 +33,7 @@ PERFCOUNTER(vpsci_system_reset, "vpsci: > system_reset") > PERFCOUNTER(vpsci_cpu_suspend, "vpsci: cpu_suspend") > PERFCOUNTER(vpsci_cpu_affinity_info, "vpsci: cpu_affinity_info") > PERFCOUNTER(vpsci_features, "vpsci: features") > +PERFCOUNTER(vpsci_system_suspend, "vpsci: system_suspend") > > PERFCOUNTER(vcpu_kick, "vcpu: notify other vcpu") > > diff --git a/xen/arch/arm/include/asm/psci.h b/xen/arch/arm/include/asm/psci.h > index 4780972621..48a93e6b79 100644 > --- a/xen/arch/arm/include/asm/psci.h > +++ b/xen/arch/arm/include/asm/psci.h > @@ -47,10 +47,12 @@ void call_psci_system_reset(void); > #define PSCI_0_2_FN32_SYSTEM_OFF PSCI_0_2_FN32(8) > #define PSCI_0_2_FN32_SYSTEM_RESET PSCI_0_2_FN32(9) > #define PSCI_1_0_FN32_PSCI_FEATURES PSCI_0_2_FN32(10) > +#define PSCI_1_0_FN32_SYSTEM_SUSPEND PSCI_0_2_FN32(14) > > #define PSCI_0_2_FN64_CPU_SUSPEND PSCI_0_2_FN64(1) > #define PSCI_0_2_FN64_CPU_ON PSCI_0_2_FN64(3) > #define PSCI_0_2_FN64_AFFINITY_INFO PSCI_0_2_FN64(4) > +#define PSCI_1_0_FN64_SYSTEM_SUSPEND PSCI_0_2_FN64(14) > > /* PSCI v0.2 affinity level state returned by AFFINITY_INFO */ > #define PSCI_0_2_AFFINITY_LEVEL_ON 0 > diff --git a/xen/arch/arm/include/asm/vpsci.h > b/xen/arch/arm/include/asm/vpsci.h > index 0cca5e6830..d790ab3715 100644 > --- a/xen/arch/arm/include/asm/vpsci.h > +++ b/xen/arch/arm/include/asm/vpsci.h > @@ -23,12 +23,15 @@ > #include <asm/psci.h> > > /* Number of function implemented by virtual PSCI (only 0.2 or later) */ > -#define VPSCI_NR_FUNCS 12 > +#define VPSCI_NR_FUNCS 14 > > /* Functions handle PSCI calls from the guests */ > bool do_vpsci_0_1_call(struct cpu_user_regs *regs, uint32_t fid); > bool do_vpsci_0_2_call(struct cpu_user_regs *regs, uint32_t fid); > > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point, > + register_t context_id); > + > #endif /* __ASM_VPSCI_H__ */ > > /* > diff --git a/xen/arch/arm/vpsci.c b/xen/arch/arm/vpsci.c > index 7ba9ccd94b..50cf5fd96c 100644 > --- a/xen/arch/arm/vpsci.c > +++ b/xen/arch/arm/vpsci.c > @@ -10,30 +10,16 @@ > > #include <public/sched.h> > > -static int do_common_cpu_on(register_t target_cpu, register_t entry_point, > - register_t context_id) > +int vpsci_vcpu_up_prepare(struct vcpu *v, register_t entry_point, > + register_t context_id) > { > - struct vcpu *v; > - struct domain *d = current->domain; > - struct vcpu_guest_context *ctxt; > int rc; > + struct domain *d = current->domain; > bool is_thumb = entry_point & 1; > - register_t vcpuid; > - > - vcpuid = vaffinity_to_vcpuid(target_cpu); > - > - if ( (v = domain_vcpu(d, vcpuid)) == NULL ) > - return PSCI_INVALID_PARAMETERS; > - > - /* THUMB set is not allowed with 64-bit domain */ > - if ( is_64bit_domain(d) && is_thumb ) > - return PSCI_INVALID_ADDRESS; > - > - if ( !test_bit(_VPF_down, &v->pause_flags) ) > - return PSCI_ALREADY_ON; > + struct vcpu_guest_context *ctxt; > > if ( (ctxt = alloc_vcpu_guest_context()) == NULL ) > - return PSCI_DENIED; > + return -ENOMEM; > > vgic_clear_pending_irqs(v); > > @@ -76,8 +62,37 @@ static int do_common_cpu_on(register_t target_cpu, > register_t entry_point, > free_vcpu_guest_context(ctxt); > > if ( rc < 0 ) > + return rc; > + > + return 0; > +} > + > +static int do_common_cpu_on(register_t target_cpu, register_t entry_point, > + register_t context_id) > +{ > + struct vcpu *v; > + struct domain *d = current->domain; > + int rc; > + bool is_thumb = entry_point & 1; > + register_t vcpuid; > + > + vcpuid = vaffinity_to_vcpuid(target_cpu); > + > + if ( (v = domain_vcpu(d, vcpuid)) == NULL ) > + return PSCI_INVALID_PARAMETERS; > + > + /* THUMB set is not allowed with 64-bit domain */ > + if ( is_64bit_domain(d) && is_thumb ) > + return PSCI_INVALID_ADDRESS; > + > + if ( !test_bit(_VPF_down, &v->pause_flags) ) > + return PSCI_ALREADY_ON; > + > + rc = vpsci_vcpu_up_prepare(v, entry_point, context_id); > + if ( rc ) > return PSCI_DENIED; > > + vgic_clear_pending_irqs(v); > vcpu_wake(v); > > return PSCI_SUCCESS; > @@ -197,6 +212,48 @@ static void do_psci_0_2_system_reset(void) > domain_shutdown(d,SHUTDOWN_reboot); > } > > +static int32_t do_psci_1_0_system_suspend(register_t epoint, register_t cid) > +{ > + int32_t rc; > + struct vcpu *v; > + struct domain *d = current->domain; > + bool is_thumb = epoint & 1; > + > + /* THUMB set is not allowed with 64-bit domain */ > + if ( is_64bit_domain(d) && is_thumb ) > + return PSCI_INVALID_ADDRESS; > + > + /* SYSTEM_SUSPEND is not supported for the hardware domain yet */ > + if ( is_hardware_domain(d) ) > + return PSCI_NOT_SUPPORTED; > + > + /* Ensure that all CPUs other than the calling one are offline */ > + domain_lock(d); > + for_each_vcpu ( d, v ) > + { > + if ( v != current && is_vcpu_online(v) ) > + { > + domain_unlock(d); > + return PSCI_DENIED; > + } > + } > + domain_unlock(d); > + > + rc = domain_shutdown(d, SHUTDOWN_suspend); > + if ( rc ) > + return PSCI_DENIED; > + > + d->arch.resume_ctx.ep = epoint; > + d->arch.resume_ctx.cid = cid; > + d->arch.resume_ctx.wake_cpu = current; > + > + gprintk(XENLOG_DEBUG, > + "SYSTEM_SUSPEND requested, epoint=0x%"PRIregister", > cid=0x%"PRIregister, > + epoint, cid); > + > + return rc; > +} > + > static int32_t do_psci_1_0_features(uint32_t psci_func_id) > { > /* /!\ Ordered by function ID and not name */ > @@ -214,6 +271,8 @@ static int32_t do_psci_1_0_features(uint32_t psci_func_id) > case PSCI_0_2_FN32_SYSTEM_OFF: > case PSCI_0_2_FN32_SYSTEM_RESET: > case PSCI_1_0_FN32_PSCI_FEATURES: > + case PSCI_1_0_FN32_SYSTEM_SUSPEND: > + case PSCI_1_0_FN64_SYSTEM_SUSPEND: > case ARM_SMCCC_VERSION_FID: > return 0; > default: > @@ -344,6 +403,23 @@ bool do_vpsci_0_2_call(struct cpu_user_regs *regs, > uint32_t fid) > return true; > } > > + case PSCI_1_0_FN32_SYSTEM_SUSPEND: > + case PSCI_1_0_FN64_SYSTEM_SUSPEND: > + { > + register_t epoint = PSCI_ARG(regs, 1); > + register_t cid = PSCI_ARG(regs, 2); > + > + if ( fid == PSCI_1_0_FN32_SYSTEM_SUSPEND ) > + { > + epoint &= GENMASK(31, 0); > + cid &= GENMASK(31, 0); > + } > + > + perfc_incr(vpsci_system_suspend); > + PSCI_SET_RESULT(regs, do_psci_1_0_system_suspend(epoint, cid)); > + return true; > + } > + > default: > return false; > } > diff --git a/xen/arch/ppc/stubs.c b/xen/arch/ppc/stubs.c > index bdaf474c5c..0db0627b5c 100644 > --- a/xen/arch/ppc/stubs.c > +++ b/xen/arch/ppc/stubs.c > @@ -224,6 +224,11 @@ void arch_domain_creation_finished(struct domain *d) > BUG_ON("unimplemented"); > } > > +int arch_domain_resume(struct domain *d) > +{ > + return 0; > +} > + > int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c) > { > BUG_ON("unimplemented"); > diff --git a/xen/arch/riscv/stubs.c b/xen/arch/riscv/stubs.c > index 1a8c86cd8d..52532ae14d 100644 > --- a/xen/arch/riscv/stubs.c > +++ b/xen/arch/riscv/stubs.c > @@ -198,6 +198,11 @@ void arch_domain_creation_finished(struct domain *d) > BUG_ON("unimplemented"); > } > > +int arch_domain_resume(struct domain *d) > +{ > + return 0; > +} > + > int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c) > { > BUG_ON("unimplemented"); > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index 19fd86ce88..94a06bc697 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -1138,6 +1138,11 @@ void arch_domain_creation_finished(struct domain *d) > hvm_domain_creation_finished(d); > } > > +int arch_domain_resume(struct domain *d) > +{ > + return 0; > +} > + > #ifdef CONFIG_COMPAT > #define xen_vcpu_guest_context vcpu_guest_context > #define fpu_ctxt fpu_ctxt.x > diff --git a/xen/common/domain.c b/xen/common/domain.c > index 104e917f07..d73a88ced5 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -1352,6 +1352,7 @@ int domain_shutdown(struct domain *d, u8 reason) > void domain_resume(struct domain *d) > { > struct vcpu *v; > + int rc; > > /* > * Some code paths assume that shutdown status does not get reset under > @@ -1359,6 +1360,14 @@ void domain_resume(struct domain *d) > */ > domain_pause(d); > > + rc = arch_domain_resume(d); > + if ( rc ) > + { > + domain_unpause(d); > + printk("%pd: Failed to resume domain (ret %d)\n", d, rc); > + return; > + } > + > spin_lock(&d->shutdown_lock); > > d->is_shutting_down = d->is_shut_down = 0; > diff --git a/xen/include/xen/domain.h b/xen/include/xen/domain.h > index e10baf2615..5f77ffadf1 100644 > --- a/xen/include/xen/domain.h > +++ b/xen/include/xen/domain.h > @@ -109,6 +109,8 @@ int arch_domain_soft_reset(struct domain *d); > > void arch_domain_creation_finished(struct domain *d); > > +int arch_domain_resume(struct domain *d); > + > void arch_p2m_set_access_required(struct domain *d, bool access_required); > > int arch_set_info_guest(struct vcpu *v, vcpu_guest_context_u c); -- WBR, Volodymyr