> Subject: Re: [PATCH 1/2] xen/arm: Add imx8q{m,x} platform glue
> 
> On Wed, 20 Mar 2024, Julien Grall wrote:
> > Hi John,
> >
> > On 20/03/2024 16:24, John Ernberg wrote:
> > > Hi Bertrand,
> > >
> > > On 3/13/24 11:07, Bertrand Marquis wrote:
> > > > Hi,
> > > >
> > > > > On 8 Mar 2024, at 15:04, Julien Grall <jul...@xen.org> wrote:
> > > > >
> > > > > Hi John,
> > > > >
> > > > > Thank you for the reply.
> > > > >
> > > > > On 08/03/2024 13:40, John Ernberg wrote:
> > > > > > On 3/7/24 00:07, Julien Grall wrote:
> > > > > > >    > Ping on the watchdog discussion bits.
> > > > > > >
> > > > > > > Sorry for the late reply.
> > > > > > >
> > > > > > > On 06/03/2024 13:13, John Ernberg wrote:
> > > > > > > > On 2/9/24 14:14, John Ernberg wrote:
> > > > > > > > >
> > > > > > > > > >       * IMX_SIP_TIMER_*:  This seems to be related to
> > > > > > > > > > the watchdog.
> > > > > > > > > > Shouldn't dom0 rely on the watchdog provided by Xen
> instead?
> > > > > > > > > > So those
> > > > > > > > > > call will be used by Xen.
> > > > > > > > >
> > > > > > > > > That is indeed a watchdog SIP, and also for setting the
> > > > > > > > > SoC internal RTC if it is being used.
> > > > > > > > >
> > > > > > > > > I looked around if there was previous discussion and
> > > > > > > > > only really found [3].
> > > > > > > > > Is the xen/xen/include/watchdog.h header meant to be for
> > > > > > > > > this kind of watchdog support or is that more for the VM
> > > > > > > > > watchdog? Looking at the x86 ACPI NMI watchdog it seems
> > > > > > > > > like the former, but I have never worked with
> > > > > > > > > x86 nor ACPI.
> > > > > > >
> > > > > > > include/watchdog.h contains helper to configure the watchdog
> > > > > > > for Xen. We also have per-VM watchdog and this is configured
> > > > > > > by the hypercall SCHEDOP_watchdog.
> > > > > > >
> > > > > > > > >
> > > > > > > > > Currently forwarding it to Dom0 has not caused any
> > > > > > > > > watchdog resets with our watchdog timeout settings, our
> > > > > > > > > specific Dom0 setup and VM count.
> > > > > > >
> > > > > > > IIUC, the SMC API for the watchdog would be similar to the
> > > > > > > ACPI NMI watchdog. So I think it would make more sense if
> > > > > > > this is not exposed to
> > > > > > > dom0 (even if Xen is doing nothing with it).
> > > > > > >
> > > > > > > Can you try to hide the SMCs and check if dom0 still behave
> > > > > > > properly?
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > This SMC manages a hardware watchdog, if it's not pinged
> > > > > > within a specific interval the entire board resets.
> > > > >
> > > > > Do you know what's the default interval? Is it large enough so
> > > > > Xen +
> > > > > dom0 can boot (at least up to when the watchdog driver is 
> > > > > initialized)?
> > > > >
> > > > > > If I block the SMCs the watchdog driver in Dom0 will fail to
> > > > > > ping the watchdog, triggering a board reset because the system
> > > > > > looks to have become unresponsive. The reason this patch set
> > > > > > started is because we couldn't ping the watchdog when running with
> Xen.
> > > > > > In our specific system the bootloader enables the watchdog as
> > > > > > early as possible so that we can get watchdog protection for
> > > > > > as much of the boot as we possibly can.
> > > > > > So, if we are to block the SMC from Dom0, then Xen needs to
> > > > > > take over the pinging. It could be implemented similarly to
> > > > > > the NMI watchdog, except that the system will reset if the
> > > > > > ping is missed rather than backtrace.
> > > > > > It would also mean that Xen will get a whole watchdog
> > > > > > driver-category due to the watchdog being vendor and sometimes
> > > > > > even SoC specific when it comes to Arm.
> > > > > > My understanding of the domain watchdog code is that today the
> > > > > > domain needs to call SCHEDOP_watchdog at least once to start
> > > > > > the watchdog timer. Since watchdog protection through the
> > > > > > whole boot process is desirable we'd need some core changes,
> > > > > > such as an option to start the domain watchdog on init. > It's
> > > > > > quite a big change to make
> > > > >
> > > > > For clarification, above you seem to mention two changes:
> > > > >
> > > > > 1) Allow Xen to use the HW watchdog
> > > > > 2) Allow the domain to use the watchdog early
> > > > >
> > > > > I am assuming by big change, you are referring to 2?
> > > > >
> > > > > , while I am not against doing it if it
> > > > > > makes sense, I now wonder if Xen should manage hardware
> watchdogs.
> > > > > > Looking at the domain watchdog code it looks like if a domain
> > > > > > does not get enough execution time, the watchdog will not be
> > > > > > pinged enough and the guest will be reset. So either watchdog
> > > > > > approach requires Dom0 to get execution time. Dom0 also needs
> > > > > > to service all the PV backends it's responsible for. I'm not
> > > > > > sure it's valuable to add another layer of watchdog for this
> > > > > > scenario as the end result (checking that the entire system
> > > > > > works) is achieved without it as well.
> > > > > > So, before I try to find the time to make a proposal for
> > > > > > moving the hardware watchdog bit to Xen, do we really want it?
> > > > >
> > > > > Thanks for the details. Given that the watchdog is enabled by
> > > > > the bootloader, I think we want Xen to drive the watchdog for two
> reasons:
> > > > > 1) In true dom0less environment, dom0 would not exist
> > > > > 2) You are relying on Xen + Dom0 to boot (or at least enough to
> > > > > get the watchdog working) within the watchdog interval.
> > > >
> > > > Definitely we need to consider the case where there is no Dom0.
> > > >
> > > > I think there are in fact 3 different use cases here:
> > > > - watchdog fully driven in a domain (dom0 or another): would work
> > > > if it is expected
> > > >     that Xen + Domain boot time is under the watchdog initial
> > > > refresh rate. I think this
> > > >     could make sense on some applications where your system
> > > > depends on a specific
> > > >     domain to be properly booted to work. This would require an
> > > > initial refresh time
> > > >     configurable in the boot loader probably.
> > >
> > > This is our use-case. ^
> > >
> > > Our dom0 is monitoring and managing the other domains in our system.
> > > Without dom0 working the system isn't really working as a whole.
> > >
> > > @Julien: Would you be ok with the patch set continuing in the
> > > direction of the original proposal, letting another party (or me at
> > > a later time) implement the fully driven by Xen option?
> > I am concerned about this particular point from Bertrand:
> >
> > "would work if it is expected that Xen + Domain boot time is under the
> > watchdog initial refresh rate."
> >
> > How will a user be able to figure out how to initially configure the 
> > watchdog?
> > Is this something you can easily configure in the bootloader at runtime?
> >
> >
> > Overall, I am not for this approach at least in the current status. I
> > would be more inclined if there are some documentations explaining how
> > this is supposed to be configured on NXP, so others can use the code.

I will try to push inside NXP to release a documentation on SIP usage.
But not expect it will be released soon.
The SIP number is quite stable, and we not change the meaning.

Regards,
Peng.

> >
> > Anyway, this is why we have multiple Arm maintainers for this kind of
> > situation. If they other agrees with you, then they can ack the patch
> > and this can be merged.
> 
> 
> The approach here would not be my choice either. However, I think it would
> be nice to have better support for NXP imx8 boards in upstream Xen. To that
> end, I would ack these patches but I would ask to add a document under
> xen.git/docs/ explaining the approach, limitations, and requirements, so that
> someone else can use the code effectively.

Reply via email to