On 16.03.2023 10:27, Roger Pau Monné wrote:
> On Thu, Mar 16, 2023 at 09:55:03AM +0100, Jan Beulich wrote:
>> On 16.03.2023 01:44, Stefano Stabellini wrote:
>>> On Wed, 15 Mar 2023, Roger Pau Monné wrote:
>>>> On Sun, Mar 12, 2023 at 03:54:55PM +0800, Huang Rui wrote:
>>>>> From: Chen Jiqian <jiqian.c...@amd.com>
>>>>>
>>>>> Use new xc_physdev_gsi_from_irq to get the GSI number
>>>>>
>>>>> Signed-off-by: Chen Jiqian <jiqian.c...@amd.com>
>>>>> Signed-off-by: Huang Rui <ray.hu...@amd.com>
>>>>> ---
>>>>>  tools/libs/light/libxl_pci.c | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
>>>>> index f4c4f17545..47cf2799bf 100644
>>>>> --- a/tools/libs/light/libxl_pci.c
>>>>> +++ b/tools/libs/light/libxl_pci.c
>>>>> @@ -1486,6 +1486,7 @@ static void pci_add_dm_done(libxl__egc *egc,
>>>>>          goto out_no_irq;
>>>>>      }
>>>>>      if ((fscanf(f, "%u", &irq) == 1) && irq) {
>>>>> +        irq = xc_physdev_gsi_from_irq(ctx->xch, irq);
>>>>
>>>> This is just a shot in the dark, because I don't really have enough
>>>> context to understand what's going on here, but see below.
>>>>
>>>> I've taken a look at this on my box, and it seems like on
>>>> dom0 the value returned by /sys/bus/pci/devices/SBDF/irq is not
>>>> very consistent.
>>>>
>>>> If devices are in use by a driver the irq sysfs node reports either
>>>> the GSI irq or the MSI IRQ (in case a single MSI interrupt is
>>>> set up).
>>>>
>>>> It seems like pciback in Linux does something to report the correct
>>>> value:
>>>>
>>>> root@lcy2-dt107:~# cat /sys/bus/pci/devices/0000\:00\:14.0/irq
>>>> 74
>>>> root@lcy2-dt107:~# xl pci-assignable-add 00:14.0
>>>> root@lcy2-dt107:~# cat /sys/bus/pci/devices/0000\:00\:14.0/irq
>>>> 16
>>>>
>>>> As you can see, making the device assignable changed the value
>>>> reported by the irq node to be the GSI instead of the MSI IRQ, I would
>>>> think you are missing something similar in the PVH setup (some pciback
>>>> magic)?
>>>>
>>>> Albeit I have no idea why you would need to translate from IRQ to GSI
>>>> in the way you do in this and related patches, because I'm missing the
>>>> context.
>>>
>>> As I mentioned in another email, also keep in mind that we need QEMU to
>>> work and QEMU calls:
>>> 1) xc_physdev_map_pirq (this is also called from libxl)
>>> 2) xc_domain_bind_pt_pci_irq
>>>
>>>
>>> In this case IRQ != GSI (IRQ == 112, GSI == 28). Sysfs returns the IRQ
>>> in Linux (112), but actually xc_physdev_map_pirq expects the GSI, not
>>> the IRQ. If you look at the implementation of xc_physdev_map_pirq,
>>> you'll see the type is "MAP_PIRQ_TYPE_GSI" and also see the check in Xen
>>> xen/arch/x86/irq.c:allocate_and_map_gsi_pirq:
>>>
>>>     if ( index < 0 || index >= nr_irqs_gsi )
>>>     {
>>>         dprintk(XENLOG_G_ERR, "dom%d: map invalid irq %d\n", d->domain_id,
>>>                 index);
>>>         return -EINVAL;
>>>     }
>>>
>>> nr_irqs_gsi < 112, and the check will fail.
>>>
>>> So we need to pass the GSI to xc_physdev_map_pirq. To do that, we need
>>> to discover the GSI number corresponding to the IRQ number.
>>
>> That's one possible approach. Another could be (making a lot of assumptions)
>> that a PVH Dom0 would pass in the IRQ it knows for this interrupt and Xen
>> then translates that to GSI, knowing that PVH doesn't have (host) GSIs
>> exposed to it.
> 
> I don't think Xen can translate a Linux IRQ to a GSI, as that's a
> Linux abstraction Xen has no part in.

Well, I was talking about whatever Dom0 and Xen use to communicate. I.e.
if at all I might have meant pIRQ, but now that you mention ...

> The GSIs exposed to a PVH dom0 are the native (host) ones, as we
> create an emulated IO-APIC topology that mimics the physical one.
> 
> Question here is why Linux ends up with an IRQ != GSI, as it's my
> understanding on Linux GSIs will always be identity mapped to IRQs, and
> the IRQ space up to the last possible GSI is explicitly reserved for
> this purpose.

... this, I guess pIRQ was a PV-only concept, and it really ought to be
GSI in the PVH case. So yes, it then all boils down to that Linux-
internal question.

Jan
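
For reference, the range check quoted from allocate_and_map_gsi_pirq can be
modelled in isolation to show why the Linux IRQ (112) is rejected while the
GSI (28) is accepted. This is only a sketch, not the Xen code: the function
name example_check_gsi_index and the limit EXAMPLE_NR_IRQS_GSI (56) are
made up for illustration; the thread only states that nr_irqs_gsi < 112 on
the affected machine.

```c
#include <errno.h>

/* Assumption: stand-in for Xen's nr_irqs_gsi. The real value is
 * host-dependent; the thread only says it is below 112. */
#define EXAMPLE_NR_IRQS_GSI 56

/* Hypothetical helper reproducing only the bounds check quoted from
 * xen/arch/x86/irq.c:allocate_and_map_gsi_pirq. It performs no mapping. */
static int example_check_gsi_index(int index, int nr_irqs_gsi)
{
    if (index < 0 || index >= nr_irqs_gsi)
        return -EINVAL; /* index is outside the GSI range */

    return 0; /* index is a plausible GSI */
}
```

With these assumed numbers, example_check_gsi_index(112, EXAMPLE_NR_IRQS_GSI)
returns -EINVAL, while example_check_gsi_index(28, EXAMPLE_NR_IRQS_GSI)
returns 0, matching the behaviour described for xc_physdev_map_pirq when it
is handed the IRQ rather than the GSI.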
