On 10/01/2024 3:34 pm, Roger Pau Monne wrote:
> When Architectural Performance Monitoring is available, the PERF_GLOBAL_CTRL
> MSR contains per-counter enable bits that is ANDed with the enable bit in the
> counter EVNTSEL MSR in order for a PMC counter to be enabled.
>
> So far the watchdog code seems to have relied on the PERF_GLOBAL_CTRL enable
> bits being set by default, but at least on some Intel Sapphire and Emerald
> Rapids this is no longer the case, and Xen reports:
>
> Testing NMI watchdog on all CPUs: 0 40 stuck
>
> The first CPU on each socket is started with PERF_GLOBAL_CTRL zeroed, so PMC0
> doesn't start counting when the enable bit in EVNTSEL0 is set, due to the
> relevant enable bit in PERF_GLOBAL_CTRL not being set.
>
> Fix by detecting when Architectural Performance Monitoring is available and
> making sure the enable bit for PMC0 is set in PERF_GLOBAL_CTRL.
>
> Signed-off-by: Roger Pau Monné <[email protected]>
> ---
> The fact that it's only the first CPU on each socket that's started with
> PERF_GLOBAL_CTRL clear looks like a firmware bug to me, but in any case making
> sure PERF_GLOBAL_CTRL is properly setup should be done regardless.

It's each package-BSP, and yes, this is clearly a firmware bug.  It's
probably worth saying that we're raising it with Intel, but this bug is
out in production firmware for SPR and EMR.

> ---
>  xen/arch/x86/nmi.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/xen/arch/x86/nmi.c b/xen/arch/x86/nmi.c
> index dc79c25e3ffd..7a6601c4fd31 100644
> --- a/xen/arch/x86/nmi.c
> +++ b/xen/arch/x86/nmi.c
> @@ -335,6 +335,19 @@ static void setup_p6_watchdog(unsigned counter)
>           nmi_p6_event_width > BITS_PER_LONG )
>          return;
>  
> +    if ( cpu_has_arch_perfmon )
> +    {
> +        uint64_t global_ctrl;
> +
> +        rdmsrl(MSR_CORE_PERF_GLOBAL_CTRL, global_ctrl);
> +        /*
> +         * Make sure PMC0 is enabled in global control, as the enable bit in
> +         * PERF_GLOBAL_CTRL is AND'ed with the enable bit in EVNTSEL0.
> +         */
> +        if ( !(global_ctrl & 1) )
> +            wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, global_ctrl | 1);

My gut feeling is that we ought to reinstate all bits, not just bit 1. 
If nothing else because that will make debugging using other counters
more reliable too.

vPMU (although mutually exclusive with watchdog) does context switch
this register as a whole.

See how global_ctrl_mask gets set up, although I'm not sure how much of
that infrastructure we really want to reuse here.

~Andrew

Reply via email to