On Tue, Jan 21, 2025 at 02:25:10PM +0000, Andrew Cooper wrote:
> Logic using performance counters needs to look at
> MSR_MISC_ENABLE.PERF_AVAILABLE before touching any other resources.
> 
> When virtualised under ESX, Xen dies with a #GP fault trying to read
> MSR_CORE_PERF_GLOBAL_CTRL.
> 
> Factor this logic out into a separate function (it's already too squashed to
> the RHS), and insert a check of MSR_MISC_ENABLE.PERF_AVAILABLE.
> 
> This also limits setting X86_FEATURE_ARCH_PERFMON, although oprofile (the only
> consumer of this flag) cross-checks too.
> 
> Reported-by: Jonathan Katz <jonathan.k...@aptar.com>
> Link: https://xcp-ng.org/forum/topic/10286/nesting-xcp-ng-on-esx-8
> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> ---
> CC: Jan Beulich <jbeul...@suse.com>
> CC: Roger Pau Monné <roger....@citrix.com>
> CC: Oleksii Kurochko <oleksii.kuroc...@gmail.com>
> 
> Untested, but this is the same pattern used by oprofile and watchdog setup.
> 
> I've intentionally stopped using Intel style.  This file is already mixed (as
> visible even in context), and it doesn't remotely resemble its Linux origin
> any more.
> 
> For 4.20.  This regression has already been backported.
> ---
>  xen/arch/x86/cpu/intel.c | 64 +++++++++++++++++++++++-----------------
>  1 file changed, 37 insertions(+), 27 deletions(-)
> 
> diff --git a/xen/arch/x86/cpu/intel.c b/xen/arch/x86/cpu/intel.c
> index 6a7347968ba2..586ae84d806d 100644
> --- a/xen/arch/x86/cpu/intel.c
> +++ b/xen/arch/x86/cpu/intel.c
> @@ -535,39 +535,49 @@ static void intel_log_freq(const struct cpuinfo_x86 *c)
>      printk("%u MHz\n", (factor * max_ratio + 50) / 100);
>  }
>  
> +static void init_intel_perf(struct cpuinfo_x86 *c)
> +{
> +    uint64_t val;
> +    unsigned int eax, ver, nr_cnt;
> +
> +    if ( c->cpuid_level <= 9 ||
> +         rdmsr_safe(MSR_IA32_MISC_ENABLE, val) ||
> +         !(val & MSR_IA32_MISC_ENABLE_PERF_AVAIL) )
> +        return;
> +
> +    eax = cpuid_eax(10);
> +    ver = eax & 0xff;
> +    nr_cnt = (eax >> 8) & 0xff;
> +
> +    if ( ver && nr_cnt > 1 && nr_cnt <= 32 )
> +    {
> +        unsigned int cnt_mask = (1UL << nr_cnt) - 1;
> +
> +        /*
> +         * On (some?) Sapphire/Emerald Rapids platforms each package-BSP
> +         * starts with all the enable bits for the general-purpose PMCs
> +         * cleared.  Adjust so counters can be enabled from EVNTSEL.
> +         */
> +        rdmsrl(MSR_CORE_PERF_GLOBAL_CTRL, val);
> +
> +        if ( (val & cnt_mask) != cnt_mask )
> +        {
> +            printk("FIRMWARE BUG: CPU%u invalid PERF_GLOBAL_CTRL: %#"PRIx64" adjusting to %#"PRIx64"\n",
> +                   smp_processor_id(), val, val | cnt_mask);
> +            wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, val | cnt_mask);
> +        }
> +    }
> +
> +    __set_bit(X86_FEATURE_ARCH_PERFMON, c->x86_capability);

With this chunk moved back inside the if scope, and the Fixes tag
added:

Reviewed-by: Roger Pau Monné <roger....@citrix.com>
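
To illustrate the requested placement, here is a minimal standalone sketch
of the decision logic. It is not the Xen code: the struct and function
names are hypothetical, and the MSR and CPUID values are passed in as plain
arguments instead of being read from hardware. It only shows the feature
flag being set inside the if scope, after the PERF_AVAIL check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
struct perf_fixup {
    bool arch_perfmon;  /* would X86_FEATURE_ARCH_PERFMON be set? */
    bool needs_write;   /* would PERF_GLOBAL_CTRL be rewritten? */
    uint64_t new_ctrl;  /* value PERF_GLOBAL_CTRL would end up with */
};

/*
 * perf_avail:  state of MSR_IA32_MISC_ENABLE.PERF_AVAIL (assumed already read)
 * eax:         CPUID leaf 0xA, EAX
 * global_ctrl: current MSR_CORE_PERF_GLOBAL_CTRL value
 */
struct perf_fixup perf_fixup_sketch(bool perf_avail, uint32_t eax,
                                    uint64_t global_ctrl)
{
    struct perf_fixup r = { false, false, global_ctrl };
    unsigned int ver = eax & 0xff;            /* architectural PMU version */
    unsigned int nr_cnt = (eax >> 8) & 0xff;  /* general-purpose counters */

    /* Without PERF_AVAIL, no other perf resource may be touched. */
    if ( !perf_avail )
        return r;

    if ( ver && nr_cnt > 1 && nr_cnt <= 32 )
    {
        uint64_t cnt_mask = (UINT64_C(1) << nr_cnt) - 1;

        if ( (global_ctrl & cnt_mask) != cnt_mask )
        {
            r.needs_write = true;
            r.new_ctrl = global_ctrl | cnt_mask;
        }

        /* The flag is set inside the if scope, not after it. */
        r.arch_perfmon = true;
    }

    return r;
}

/* Usage example: 8 counters reported, only the fixed-counter bits enabled. */
void perf_fixup_example(void)
{
    struct perf_fixup r = perf_fixup_sketch(true, 0x0805, 0x700000000ULL);

    assert(r.arch_perfmon && r.needs_write);
    assert(r.new_ctrl == 0x7000000ffULL);
}
```

With the flag set inside the if scope, a CPU that reports an unusable
counter configuration (version 0, or a counter count outside 2..32) does
not advertise ARCH_PERFMON.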

Thanks, Roger.
