On 11.09.2023 19:51, Simon Gaiser wrote:
> Jan Beulich:
>> On 11.09.2023 12:12, Simon Gaiser wrote:
>>> Up to version 6.2 Errata B [2] the ACPI spec only defines
>>> ACPI_MADT_ENABLE as:
>>>
>>>     If zero, this processor is unusable, and the operating system
>>>     support will not attempt to use it.
>>>
>>> The bit that later will be ACPI_MADT_ONLINE_CAPABLE is reserved with
>>> "Must be zero".
>>>
>>> Version 6.3 [3] then adds ACPI_MADT_ONLINE_CAPABLE and changes the
>>> meaning of ACPI_MADT_ENABLE:
>>>
>>>     Enabled
>>>         If this bit is set the processor is ready for use. If this bit
>>>         is clear and the Online Capable bit is set, system hardware
>>>         supports enabling this processor during OS runtime. If this bit
>>>         is clear and the Online Capable bit is also clear, this
>>>         processor is unusable, and OSPM shall ignore the contents of the
>>>         Processor Local APIC Structure.
>>>
>>>     Online Capbable
>>>         The information conveyed by this bit depends on the value of the
>>>         Enabled bit. If the Enabled bit is set, this bit is reserved and
>>>         must be zero. Otherwise, if this this bit is set, system
>>>         hardware supports enabling this processor during OS runtime.
>>>
>>> So with conforming firmwares it should be safe to simply ignore the
>>> entry if !ACPI_MADT_ENABLED && !ACPI_MADT_ONLINE_CAPABLE
>>>
>>> As a precaution against buggy firmwares this change, like Linux [4],
>>> ignores ACPI_MADT_ONLINE_CAPABLE completely if MADT revision < 5. Note
>>> that the MADT revision was already increased to 5 with spec version 6.2
>>> Errata A [1], so before introducing the online capable flag. But it
>>> wasn't changed for the new flag, so this is the best we can do here.
>>>
>>> For previous discussion see thread [5].
>>>
>>> Link: 
>>> http://www.uefi.org/sites/default/files/resources/ACPI%206_2_A_Sept29.pdf # 
>>> [1]
>>> Link: 
>>> https://uefi.org/sites/default/files/resources/ACPI_6_2_B_final_Jan30.pdf # 
>>> [2]
>>> Link: https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf # 
>>> [3]
>>> Link: 
>>> https://git.kernel.org/torvalds/c/e2869bd7af608c343988429ceb1c2fe99644a01f 
>>> # [4]
>>> Link: 
>>> https://lore.kernel.org/xen-devel/[email protected]/
>>>  # [5]
>>> Signed-off-by: Simon Gaiser <[email protected]>
>>
>> Just to avoid misunderstandings: This change addresses a comment from
>> Andrew. I does not attempt to address my feedback on the earlier change,
>> which - as indicated - I intend to revert (timeline extended until mid
>> of the week), unless by then my earlier comments are addressed or the
>> suggested possible alternative is carried out.
>>
>> I think it goes without saying that this patch can't sensibly go in
>> until the earlier one was properly settled upon. In particular, in case
>> of reverting aiui this patch won't even apply anymore.
> 
> It textually conflicts with the other patch [6], and obviously was
> triggered by that discussion, but addresses a slightly different aspect.
> The textual conflict is trivial to resolve. I wasn't sure if it also
> conflicts with the concern you raised there, see below.
> 
> The other patch is about ignoring entries with invalid APIC IDs. As the
> discussion there showed the spec isn't very clear on that and there are
> different opinions how they should be handled. Regarding the flags the
> spec is much more concrete although given the response above I assume
> you also interpret "is unusable" of the old version of the
> ACPI_MADT_ENABLE flag as such that Xen should still allocate resources
> for those processors?
> 
> If I understood your main concern with the other patch correctly you
> have seen firmwares that later update those invalid APIC IDs with valid
> ones. Do those firmwares make use of the online capable flag? (Given
> above response probably not)

Being an older ACPI version, they don't.

> If not, then it indeed doesn't address your concern, and I see no way,
> at least by parsing MADT, how to distinguish those cases from firmwares
> that have dummy entries (the motivation for these patches).

Right. And while this _may_ be kind of acceptable when accompanied by a
downgrade of CPU hotplug support status, I haven't seen any patch to this
effect up to now. Without such a downgrade, my review comments would need
addressing, to avoid a perceived regression. Same goes for the tightening
done in the patch here: It _may_ be kind of acceptable, but only under
that same condition.

Jan

Reply via email to