On Tue, Oct 29, 2024 at 08:40:26AM +0100, Jan Beulich wrote:
> On 28.10.2024 18:22, Roger Pau Monné wrote:
> > On Mon, Oct 28, 2024 at 12:33:42PM +0100, Jan Beulich wrote:
> >> On 24.10.2024 17:48, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/io_apic.c
> >>> +++ b/xen/arch/x86/io_apic.c
> >>> @@ -71,6 +71,24 @@ static int apic_pin_2_gsi_irq(int apic, int pin);
> >>>  
> >>>  static vmask_t *__read_mostly vector_map[MAX_IO_APICS];
> >>>  
> >>> +/*
> >>> + * Store the EOI handle when using interrupt remapping.
> >>> + *
> >>> + * If using AMD-Vi interrupt remapping the IO-APIC redirection entry 
> >>> remapped
> >>> + * format repurposes the vector field to store the offset into the 
> >>> Interrupt
> >>> + * Remap table.  This causes directed EOI to longer work, as the CPU 
> >>> vector no
> >>> + * longer matches the contents of the RTE vector field.  Add a 
> >>> translation
> >>> + * table so that directed EOI uses the value in the RTE vector field when
> >>> + * interrupt remapping is enabled.
> >>> + *
> >>> + * Note Intel VT-d Xen code still stores the CPU vector in the RTE 
> >>> vector field
> >>> + * when using the remapped format, but use the translation table 
> >>> uniformly in
> >>> + * order to avoid extra logic to differentiate between VT-d and AMD-Vi.
> >>> + *
> >>> + * The matrix is accessed as [#io-apic][#pin].
> >>> + */
> >>> +static uint8_t **io_apic_pin_eoi;
> >>
> >> Wasn't the conclusion from the v1 discussion that this needs to be a signed
> >> type wider than 8 bits?
> >>
> >>> @@ -298,6 +323,9 @@ static void __io_apic_eoi(unsigned int apic, unsigned 
> >>> int vector, unsigned int p
> >>>      /* Prefer the use of the EOI register if available */
> >>>      if ( ioapic_has_eoi_reg(apic) )
> >>>      {
> >>> +        if ( io_apic_pin_eoi )
> >>> +            vector = io_apic_pin_eoi[apic][pin];
> >>> +
> >>>          /* If vector is unknown, read it from the IO-APIC */
> >>>          if ( vector == IRQ_VECTOR_UNASSIGNED )
> >>>              vector = __ioapic_read_entry(apic, pin, true).vector;
> >>
> >> In addition to what Andrew said here, for this comparison the make sense
> >> ...
> >>
> >>> @@ -1022,7 +1050,20 @@ static void __init setup_IO_APIC_irqs(void)
> >>>  
> >>>      apic_printk(APIC_VERBOSE, KERN_DEBUG "init IO_APIC IRQs\n");
> >>>  
> >>> +    if ( iommu_intremap )
> >>> +    {
> >>> +        io_apic_pin_eoi = xzalloc_array(typeof(*io_apic_pin_eoi), 
> >>> nr_ioapics);
> >>> +        BUG_ON(!io_apic_pin_eoi);
> >>> +    }
> >>> +
> >>>      for (apic = 0; apic < nr_ioapics; apic++) {
> >>> +        if ( iommu_intremap )
> >>> +        {
> >>> +            io_apic_pin_eoi[apic] = 
> >>> xzalloc_array(typeof(**io_apic_pin_eoi),
> >>> +                                                  
> >>> nr_ioapic_entries[apic]);
> >>> +            BUG_ON(!io_apic_pin_eoi[apic]);
> >>> +        }
> >>
> >> ... doesn't the array also need -1 (== IRQ_VECTOR_UNASSIGNED) filling,
> >> rather than zero-filling?
> > 
> > Replying here to both you and Andrews question.  My analysis is that
> > a sentinel is not needed.  clear_IO_APIC_pin() is the only function
> > that calls the EOI routine outside of the irq_desc handlers logic.
> > 
> > It's used either by clear_IO_APIC(), which gets called before
> > io_apic_pin_eoi is allocated,
> 
> Or long after, from disable_IO_APIC().
> 
> > or by check_timer() and/or
> > unlock_ExtINT_logic() both of which will perform an
> > ioapic_write_entry() before the clear_IO_APIC_pin() call.
> 
> In unlock_ExtINT_logic() I see a call to ioapic_read_entry(), whereas the
> call to ioapic_write_entry() happens only later. In check_timer() I'm also
> uncertain a write would occur in _all_ cases. It certainly should occur,
> or else chances are low that the timer interrupt would actually work. Yet
> we surely want to avoid making hard to debug corner cases yet more subtle.

Didn't mention it here, but setup_IO_APIC_irqs() will also perform an
__ioapic_write_entry() call for almost all pins.

> > I've done some XenRT testing with a modified patch that kept the
> > io_apic_pin_eoi as unsigned int, used the sentinel as init value and
> > added an assert in __io_apic_eoi() that the value in the array was
> > never IRQ_VECTOR_UNASSIGNED when the io_apic_pin_eoi was allocated.
> > This never triggered on any hardware XenRT tested on.
> > 
> > Maybe this seems to fragile, and you both prefer to keep the sentinel
> > just in case?
> 
> Well, how certain are you that this testing in particular covered e.g. all
> the quirk cases that check_timer() tries to deal with?

That indeed I cannot guarantee, as I don't have coverage figures for
the XenRT runs.

Seeing as there's too much uncertainty about whether the array will be
initialized, I will go back to the approach of having a sentinel.
Andrew suggested to use a 16bit types, would you agree to that or do
you prefer to use unsigned int as originally proposed?

Thanks, Roger.

Reply via email to