On 2/9/26 14:16, Roger Pau Monné wrote:
> On Mon, Feb 09, 2026 at 11:34:18AM +0000, Julian Vetter wrote:
>> x2APIC guests with more than 128 vCPUs have APIC IDs above 255, but MSI
>> addresses and IO-APIC RTEs only provide an 8-bit destination field.
>> Without extended destination ID support, Linux limits the maximum usable
>> APIC ID to 255, refusing to bring up vCPUs beyond that limit. So,
>> advertise XEN_HVM_CPUID_EXT_DEST_ID in the HVM hypervisor CPUID leaf,
>> signalling that guests may use MSI address bits 11:5 and IO-APIC RTE
>> bits 55:49 as additional high destination ID bits. This expands the
>> destination ID from 8 to 15 bits.
>>
>> Signed-off-by: Julian Vetter <[email protected]>
>> ---
>> xen/arch/x86/cpuid.c | 9 +++++++++
>> xen/arch/x86/hvm/irq.c | 11 ++++++++++-
>> xen/arch/x86/hvm/vioapic.c | 2 +-
>> xen/arch/x86/hvm/vmsi.c | 4 ++--
>> xen/arch/x86/include/asm/hvm/hvm.h | 4 ++--
>> xen/arch/x86/include/asm/hvm/vioapic.h | 13 +++++++++++++
>> xen/arch/x86/include/asm/msi.h | 3 +++
>> 7 files changed, 40 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
>> index d85be20d86..fb17c71d74 100644
>> --- a/xen/arch/x86/cpuid.c
>> +++ b/xen/arch/x86/cpuid.c
>> @@ -148,6 +148,15 @@ static void cpuid_hypervisor_leaves(const struct vcpu *v, uint32_t leaf,
>> res->a |= XEN_HVM_CPUID_DOMID_PRESENT;
>> res->c = d->domain_id;
>>
>> + /*
>> + * Advertise extended destination ID support. This allows guests to use
>> + * bits 11:5 of the MSI address and bits 55:49 of the IO-APIC RTE for
>> + * additional destination ID bits, expanding the addressable APIC ID
>> + * range from 8 to 15 bits. This is required for x2APIC guests with
>> + * APIC IDs > 255.
>> + */
>> + res->a |= XEN_HVM_CPUID_EXT_DEST_ID;
>
> This cannot be unilaterally advertised: you need a QEMU (or in general
> any device model that manages PCI passthrough) to understand the
> extended destination mode. This requires the introduction of
> a new XEN_DOMCTL_bind_pt_irq equivalent hypercall, that can take an
> extended destination ID not limited to 256 values:
>
> struct xen_domctl_bind_pt_irq {
> [...]
> uint32_t gflags;
> #define XEN_DOMCTL_VMSI_X86_DEST_ID_MASK 0x0000ff
>
> When doing PCI passthrough it's QEMU the entity that decodes the MSI
> address and data fields, and hence would need expanding (and
> negotiation with Xen) about whether the Extended ID feature can be
> advertised.
>
> It would be good to introduce a new XEN_DMOP_* set of hypercalls that
> support Extended ID to do the PCI passthrough interrupt binding.
Thank you for your feedback. But wouldn't it be enough if QEMU extracted
the additional bits and passed them on to Xen in gflags? In
pt_irq_create_bind I already extract the additional bits from gflags. In
QEMU, the function msi_dest_id would just need to extract the additional
bits before calling xc_domain_update_msi_irq. The gflags argument of
xc_domain_update_msi_irq is 32 bits wide, so there is enough room to pass
the additional bits. What do you think?
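
To make the idea concrete, here is a rough, standalone sketch of the
decoding and packing I have in mind. It is not the actual QEMU or Xen
code. XEN_DOMCTL_VMSI_X86_DEST_ID_MASK is the existing mask quoted above;
the HYP_* names, the helper functions and the choice of gflags bits 23:17
for the high bits are purely hypothetical, and I am assuming those bits do
not collide with any existing gflags field.

```c
#include <stdint.h>

/* Existing low 8-bit destination ID field in gflags (quoted above). */
#define XEN_DOMCTL_VMSI_X86_DEST_ID_MASK 0x0000ffu

/*
 * HYPOTHETICAL: a 7-bit field for the extended destination ID bits.
 * The position (bits 23:17) is an assumption made for this sketch only.
 */
#define HYP_VMSI_X86_EXT_DEST_ID_SHIFT 17
#define HYP_VMSI_X86_EXT_DEST_ID_MASK  (0x7fu << HYP_VMSI_X86_EXT_DEST_ID_SHIFT)

/* Hypothetical helper: decode a 15-bit destination ID from an MSI address. */
static uint32_t msi_ext_dest_id(uint64_t addr)
{
    uint32_t low  = (uint32_t)(addr >> 12) & 0xffu; /* address bits 19:12 */
    uint32_t high = (uint32_t)(addr >> 5) & 0x7fu;  /* address bits 11:5 */

    return (high << 8) | low;
}

/* Hypothetical helper: place a 15-bit destination ID into gflags. */
static uint32_t pack_dest_gflags(uint32_t dest_id)
{
    uint32_t gflags = dest_id & XEN_DOMCTL_VMSI_X86_DEST_ID_MASK;

    gflags |= ((dest_id >> 8) << HYP_VMSI_X86_EXT_DEST_ID_SHIFT) &
              HYP_VMSI_X86_EXT_DEST_ID_MASK;

    return gflags;
}

/* Hypothetical helper: recover the 15-bit destination ID on the Xen side. */
static uint32_t unpack_dest_gflags(uint32_t gflags)
{
    uint32_t low  = gflags & XEN_DOMCTL_VMSI_X86_DEST_ID_MASK;
    uint32_t high = (gflags & HYP_VMSI_X86_EXT_DEST_ID_MASK) >>
                    HYP_VMSI_X86_EXT_DEST_ID_SHIFT;

    return (high << 8) | low;
}
```

With this layout, a destination ID that fits in 8 bits produces the same
gflags value as today, so an older device model that only fills in the low
field would keep working unchanged.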
Thank you
Julian
>
> Thanks, Roger.
--
Julian Vetter | Vates Hypervisor & Kernel Developer
XCP-ng & Xen Orchestra - Vates solutions
web: https://vates.tech