On Wed, Mar 15, 2023 at 05:44:12PM -0700, Stefano Stabellini wrote:
> On Wed, 15 Mar 2023, Roger Pau Monné wrote:
> > On Sun, Mar 12, 2023 at 03:54:55PM +0800, Huang Rui wrote:
> > > From: Chen Jiqian <jiqian.c...@amd.com>
> > > 
> > > Use new xc_physdev_gsi_from_irq to get the GSI number
> > > 
> > > Signed-off-by: Chen Jiqian <jiqian.c...@amd.com>
> > > Signed-off-by: Huang Rui <ray.hu...@amd.com>
> > > ---
> > >  tools/libs/light/libxl_pci.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/tools/libs/light/libxl_pci.c b/tools/libs/light/libxl_pci.c
> > > index f4c4f17545..47cf2799bf 100644
> > > --- a/tools/libs/light/libxl_pci.c
> > > +++ b/tools/libs/light/libxl_pci.c
> > > @@ -1486,6 +1486,7 @@ static void pci_add_dm_done(libxl__egc *egc,
> > >          goto out_no_irq;
> > >      }
> > >      if ((fscanf(f, "%u", &irq) == 1) && irq) {
> > > +        irq = xc_physdev_gsi_from_irq(ctx->xch, irq);
> > 
> > This is just a shot in the dark, because I don't really have enough
> > context to understand what's going on here, but see below.
> > 
> > I've taken a look at this on my box, and it seems like on
> > dom0 the value returned by /sys/bus/pci/devices/SBDF/irq is not
> > very consistent.
> > 
> > If devices are in use by a driver the irq sysfs node reports either
> > the GSI irq or the MSI IRQ (in case a single MSI interrupt is
> > setup).
> > 
> > It seems like pciback in Linux does something to report the correct
> > value:
> > 
> > root@lcy2-dt107:~# cat /sys/bus/pci/devices/0000\:00\:14.0/irq
> > 74
> > root@lcy2-dt107:~# xl pci-assignable-add 00:14.0
> > root@lcy2-dt107:~# cat /sys/bus/pci/devices/0000\:00\:14.0/irq
> > 16
> > 
> > As you can see, making the device assignable changed the value
> > reported by the irq node to be the GSI instead of the MSI IRQ, I would
> > think you are missing something similar in the PVH setup (some pciback
> > magic)?
> > 
> > Albeit I have no idea why you would need to translate from IRQ to GSI
> > in the way you do in this and related patches, because I'm missing the
> > context.
> 
> As I mention in another email, also keep in mind that we need QEMU to
> work and QEMU calls:
> 1) xc_physdev_map_pirq (this is also called from libxl)
> 2) xc_domain_bind_pt_pci_irq

Those would be fine, and don't need any translation since it's QEMU
the one that creates and maps the MSI(-X) interrupts, so it knows the
PIRQ without requiring any translation because it has been allocated
by QEMU itself.

GSI is kind of special because it's a fixed (legacy) interrupt mapped
to an IO-APIC pin and assigned to the device by the firmware.  The
setup in that case gets done by the toolstack (libxl) because the
mapping is immutable for the lifetime of the domain.

> In this case IRQ != GSI (IRQ == 112, GSI == 28). Sysfs returns the IRQ
> in Linux (112), but actually xc_physdev_map_pirq expects the GSI, not
> the IRQ.

I think the real question here is why on this scenario IRQ != GSI for
GSI interrupts.  On one of my systems when booted as PVH dom0 with
pci=nomsi I get from /proc/interrupt:

  8:          0          0          0          0          0          0          
0   IO-APIC   8-edge      rtc0
  9:          1          0          0          0          0          0          
0   IO-APIC   9-fasteoi   acpi
 16:          0          0       8373          0          0          0          
0   IO-APIC  16-fasteoi   i801_smbus, xhci-hcd:usb1, ahci[0000:00:17.0]
 17:          0          0          0        542          0          0          
0   IO-APIC  17-fasteoi   eth0
 24:       4112          0          0          0          0          0          
0  xen-percpu    -virq      timer0
 25:        352          0          0          0          0          0          
0  xen-percpu    -ipi       resched0
 26:       6635          0          0          0          0          0          
0  xen-percpu    -ipi       callfunc0

So GSI == IRQ, and non GSI interrupts start past the last GSI, which
is 23 on this system because it has a single IO-APIC with 24 pins.

We need to figure out what causes GSIs to be mapped to IRQs != GSI on
your system, and then we can decide how to fix this.  I would expect
it could be fixed so that IRQ == GSI (like it's on PV dom0), and none
of this translation to be necessary.

Can you paste the output of /proc/interrupts on that system that has a
GSI not identity mapped to an IRQ?

> If you look at the implementation of xc_physdev_map_pirq,
> you'll the type is "MAP_PIRQ_TYPE_GSI" and also see the check in Xen
> xen/arch/x86/irq.c:allocate_and_map_gsi_pirq:
> 
>     if ( index < 0 || index >= nr_irqs_gsi )
>     {
>         dprintk(XENLOG_G_ERR, "dom%d: map invalid irq %d\n", d->domain_id,
>                 index);
>         return -EINVAL;
>     }
> 
> nr_irqs_gsi < 112, and the check will fail.
> 
> So we need to pass the GSI to xc_physdev_map_pirq. To do that, we need
> to discover the GSI number corresponding to the IRQ number.

Right, see above, I think the real problem is that IRQ != GSI on your
Linux dom0 for some reason.

Thanks, Roger.

Reply via email to