On Wed, Aug 13, 2025 at 09:26:09AM +0200, Roger Pau Monné wrote:
> On Wed, Aug 13, 2025 at 04:53:53AM +0200, Marek Marczykowski-Górecki wrote:
> > 
> > 
> > On August 11, 2025 3:16:46 PM GMT+02:00, Andrew Cooper 
> > <andrew.coop...@citrix.com> wrote:
> > >On 07/08/2025 6:29 pm, Marek Marczykowski-Górecki wrote:
> > >> On Wed, Aug 06, 2025 at 12:46:36PM +0200, Marek Marczykowski-Górecki 
> > >> wrote:
> > >>> On Wed, Aug 06, 2025 at 12:36:56PM +0200, Jan Beulich wrote:
> > >>>> On 06.08.2025 12:23, Marek Marczykowski-Górecki wrote:
> > >>>>> We've got several reports that S3 reliability recently regressed. We
> > >>>>> identified it's definitely related to XSA-471 patches, and bisection
> > >>>>> points at "x86/idle: Remove broken MWAIT implementation". I don't have
> > >>>>> reliable reproduction steps, so I'm not 100% sure if it's really this
> > >>>>> patch, or maybe an earlier one - but it's definitely already broken at
> > >>>>> this point in the series. Most reports are about Xen 4.17 (as that's
> > >>>>> what stable Qubes OS version currently use), but I think I've seen
> > >>>>> somebody reporting the issue on 4.19 too (but I don't have clear
> > >>>>> evidence, especially if it's the same issue).
> > >>>> At the time we've been discussing the explicit raising of TIMER_SOFTIRQ
> > >>>> in mwait_idle_with_hints() a lot. If it was now truly missing, that imo
> > >>>> shouldn't cause problems only after resume, but then it may have 
> > >>>> covered
> > >>>> for some omission during resume. As a far-fetched experiment, could you
> > >>>> try putting that back (including the calculation of the "expires" local
> > >>>> variable)?
> > >>> Sure, I'll try.
> > >> It appears this fixes the issue, at least in ~10 attempts so far
> > >> (usually I could reproduce the issue after 2-3 attempts).
> > >>
> > >
> > >Can you show the exact code which seems to have made this stable?
> > 
> > This patch: 
> > https://github.com/marmarek/qubes-vmm-xen/blob/7c9e9e312948c772d9a68090109964121c1d16fe/0001-DEBUG-S3.patch
> 
> Hello,
> 
> Can you test if the patch below in isolation makes any difference?

Seems to help too. At least a test similar as before did not hit the
issue anymore.

> Thanks, Roger.
> ---
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 2ac162c997fe..27d672ad5dbb 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -19,6 +19,7 @@
>  #include <xen/iommu.h>
>  #include <xen/param.h>
>  #include <xen/sched.h>
> +#include <xen/softirq.h>
>  #include <xen/spinlock.h>
>  #include <xen/watchdog.h>
>  
> @@ -310,6 +311,7 @@ static int enter_state(u32 state)
>      thaw_domains();
>      system_state = SYS_STATE_active;
>      spin_unlock(&pm_lock);
> +    raise_softirq(TIMER_SOFTIRQ);
>      return error;
>  }
>  
> diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
> index 0fd8bdba7067..9d66db861b74 100644
> --- a/xen/arch/x86/apic.c
> +++ b/xen/arch/x86/apic.c
> @@ -65,7 +65,6 @@ static struct {
>      unsigned int apic_lvt0;
>      unsigned int apic_lvt1;
>      unsigned int apic_lvterr;
> -    unsigned int apic_tmict;
>      unsigned int apic_tdcr;
>      unsigned int apic_thmr;
>  } apic_pm_state;
> @@ -658,7 +657,6 @@ int lapic_suspend(void)
>      apic_pm_state.apic_lvt0 = apic_read(APIC_LVT0);
>      apic_pm_state.apic_lvt1 = apic_read(APIC_LVT1);
>      apic_pm_state.apic_lvterr = apic_read(APIC_LVTERR);
> -    apic_pm_state.apic_tmict = apic_read(APIC_TMICT);
>      apic_pm_state.apic_tdcr = apic_read(APIC_TDCR);
>      if (maxlvt >= 5)
>          apic_pm_state.apic_thmr = apic_read(APIC_LVTTHMR);
> @@ -718,7 +716,7 @@ int lapic_resume(void)
>          apic_write(APIC_LVTPC, apic_pm_state.apic_lvtpc);
>      apic_write(APIC_LVTT, apic_pm_state.apic_lvtt);
>      apic_write(APIC_TDCR, apic_pm_state.apic_tdcr);
> -    apic_write(APIC_TMICT, apic_pm_state.apic_tmict);
> +    apic_write(APIC_TMICT, 0);
>      apic_write(APIC_ESR, 0);
>      apic_read(APIC_ESR);
>      apic_write(APIC_LVTERR, apic_pm_state.apic_lvterr);
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
Description: PGP signature

Reply via email to