On Wed, 13 Sep 2023, Anthony Chan wrote:
> On Mon, 11 Sep 2023, Stefano Stabellini wrote:
> > On Mon, 11 Sep 2023, Anthony Chan wrote:
> > > On Wed, 6 Sep 2023, Stefano Stabellini wrote:
> > > > On Wed, 6 Sep 2023, Anthony Chan wrote:
> > > > > Thanks, I've tried patches that stemmed from that discussion but
> > > > > > unfortunately they don't resolve the issue.  In fact, the
> > > > > s2idle_loop branch might not be the problem at all.  I
> > > > > experimented with Xen to allow the 'idle-states' into the FDT and
> > > > > prevented xen_guest_init on Linux from disabling the 'cpuidle'
> > > > > driver (arch/arm/xen/enlighten.c).  When I trigger a suspend, I
> > > > > can see now another thread (believe it's the idle thread) call
> > > > > into drivers/firmware/psci/psci.c:__psci_cpu_suspend and then the Xen
> > > > > counterpart at xen/arch/arm/vpsci.c:do_psci_0_2_cpu_suspend.
> > > >
> > > > OK but remember that Xen is not implementing do_psci_0_2_cpu_suspend
> > > > correctly at the moment. Either we need to fix the Xen
> > > > implementation, or we need to configure Linux so that it calls WFI
> > > > instead of __psci_cpu_suspend.
> > > >
> > > > As a test, can you try to apply the attached patch to Xen as a
> > > > tentative fix?  Or you could change
> > > > drivers/firmware/psci/psci.c:__psci_cpu_suspend to call WFI instead
> > > > of the PSCI operation (making sure to go to the entry_point instead of
> > > > returning).
> > >
> > > Tried the patch and substituting a WFI for a PSCI op, but Xen still
> > > watchdogs on the VMs in both cases.  I noticed the other Linux generic
> > > arm 'cpu-idle' driver which used to issue a WFI/cpu_do_idle isn't
> > > useable anymore either.  I'm not sure if Xen used to rely on this
> > > generic driver to get the WFI.
> >
> > I was running out of ideas so I went back to look at the watchdog console
> > log:
> >
> > (XEN) do_psci_0_2_cpu_suspend
> > (XEN) Watchdog timer fired for domain 0
> > (XEN) Hardware Dom0 shutdown: watchdog rebooting machine
> >
> > Checking the code, it seems that the Xen watchdog is set by
> > xen/common/sched/core.c:SCHEDOP_watchdog, which is called by
> > tools/libs/ctrl/xc_domain.c:xc_watchdog.
> >
> > xc_watchdog is called by tools/misc/xenwatchdogd.c. Is it possible that this
> > problem is entirely caused by the daemon xenwatchdogd running in the
> > background? What happens if you kill xenwatchdogd and try again without it
> > (even better not start it at all)?
> Disabling that daemon resolved the watchdog timing out.  Never noticed that 
> daemon running before.  That clears a lot up and I think I understand what's 
> going on here now, thank you for the help.

That's great! I am glad it got resolved.
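
The failure mode discussed above can be sketched with a toy model. This is
not Xen code: ToyWatchdog and run are hypothetical names, and the model
assumes that the process kicking the watchdog (a daemon such as
xenwatchdogd) stops running while the guest is suspended, while the timer
itself keeps counting. Under that assumption the watchdog fires as soon as
the suspend outlasts the remaining timeout.

```python
# Toy model only: none of these names are Xen or libxenctrl APIs.
class ToyWatchdog:
    """A timer that 'fires' unless it is re-armed before its deadline."""

    def __init__(self, timeout, now=0):
        self.timeout = timeout
        self.deadline = now + timeout

    def kick(self, now):
        # Re-arm: push the deadline out by one full timeout.
        self.deadline = now + self.timeout

    def expired(self, now):
        return now >= self.deadline


def run(timeout, kick_interval, suspend_at, suspend_for, end):
    """Return the tick at which the watchdog fires, or None if it never does.

    The kicker (standing in for the daemon) re-arms the timer every
    kick_interval ticks, except while the guest is suspended (assumption).
    """
    wd = ToyWatchdog(timeout)
    next_kick = kick_interval
    for now in range(1, end + 1):
        suspended = suspend_at <= now < suspend_at + suspend_for
        if not suspended and now >= next_kick:
            wd.kick(now)
            next_kick = now + kick_interval
        if wd.expired(now):
            return now
    return None


if __name__ == "__main__":
    # No suspend: the kicker always re-arms in time.
    print(run(timeout=30, kick_interval=15, suspend_at=100, suspend_for=0, end=300))
    # Long suspend: kicks stop and the timer runs out.
    print(run(timeout=30, kick_interval=15, suspend_at=100, suspend_for=60, end=300))
```

With no kicker running at all, nothing arms the timer in the first place,
which matches the observation that disabling the daemon made the timeout go
away.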
