Jan Beulich:
> On 07.08.2023 14:55, Simon Gaiser wrote:
>> Jan Beulich:
>>> On 07.08.2023 11:38, Simon Gaiser wrote:
>>>> It seems some firmwares put dummy entries in the ACPI MADT table for non
>>>> existing processors. On my NUC11TNHi5 those have the invalid APIC ID
>>>> 0xff. Linux already has code to handle those cases both in
>>>> acpi_parse_lapic [1] as well as in acpi_parse_x2apic [2]. So add the
>>>> same check to Xen.
>>>
>>> I'm afraid it doesn't become clear to me what problem you're trying to
>>> solve.
>>
>> I want Xen to not think there are possible CPUs that actually never can
>> be there.
> 
> Did you try using "maxcpus=" on the command line? If that doesn't work
> well enough

Yes. Then Xen says, as expected "SMP: Allowing 8 CPUs (0 hotplug CPUs)",
but disabled_cpus is still 8 and nr_sockets therefore calculated as 2.

> (perhaps because of causing undesirable log messages),

I don't mind some verbose log messages.

> maybe we need some way to say "no CPU hotplug" on the command line.

That indeed might make sense, but I'm not sure this is what I want here,
see below.

>> Without ignoring those dummy entries Xen thinks my NUC has 2 sockets and
>> that there are 8 logical CPUs that are currently disabled but could be
>> hotplugged.
> 
> Yet it's exactly this which ACPI is telling us here (with some vagueness,
> which isn't easy to get around; see below).
> 
>> I'm moderately sure that soldering in another CPU is not supported, even
>> less so at runtime ;]
> 
> On your system. What about others, which are hotplug-capable?

Would those be listed with an invalid APIC ID in their entries? Then I
understand your complaint.

PS: based on your reply to Andrew, you think that this might be the case.

>>>> --- a/xen/arch/x86/acpi/boot.c
>>>> +++ b/xen/arch/x86/acpi/boot.c
>>>> @@ -87,14 +87,17 @@ acpi_parse_x2apic(struct acpi_subtable_header *header, 
>>>> const unsigned long end)
>>>>    if (BAD_MADT_ENTRY(processor, end))
>>>>            return -EINVAL;
>>>>  
>>>> +  /* Ignore entries with invalid apicid */
>>>> +  if (processor->local_apic_id == 0xffffffff)
>>>> +          return 0;
>>>> +
>>>>    /* Don't register processors that cannot be onlined. */
>>>>    if (madt_revision >= 5 &&
>>>>        !(processor->lapic_flags & ACPI_MADT_ENABLED) &&
>>>>        !(processor->lapic_flags & ACPI_MADT_ONLINE_CAPABLE))
>>>>            return 0;
>>>>  
>>>> -  if ((processor->lapic_flags & ACPI_MADT_ENABLED) ||
>>>> -      processor->local_apic_id != 0xffffffff || opt_cpu_info) {
>>>> +  if ((processor->lapic_flags & ACPI_MADT_ENABLED) || opt_cpu_info) {
>>>>            acpi_table_print_madt_entry(header);
>>>>            log = true;
>>>>    }
>>>
>>> In particular you're now suppressing log messages which may be relevant.
>>
>> I intentionally mirrored the behavior of the check directly below.
>> Unlike the the version in Linux the existing code didn't log ignored
>> entries. So I did the same for the entries with an invalid ID.
> 
> I'm afraid I can't bring in line the check you add early in the function
> with what is "directly below" [here]. I'm only inferring the "here" from
> the placement of your reply. Maybe instead you meant the rev >= 5 one,

Yes exactly. "directly below" was meant relative to the if I added.

> in which case I'm afraid the two are too different to compare: That
> rev >= 5 one tells us that the entry isn't going to be hotpluggable. The
> APIC ID check you add makes no such guarantees.

As mentioned above, if that's the case I see the problem.

> That's why the new flag was actually added in v5.

See other branch of this thread:
https://lore.kernel.org/xen-devel/80bae614-052e-0f90-cf13-0e5e4ed1a...@invisiblethingslab.com/
So we might cover at least my case regardless how invalid APIC IDs
should be interpreted.

Attachment: OpenPGP_signature
Description: OpenPGP digital signature

Reply via email to