Hi Roger,

Roger Pau Monné <[email protected]> writes:

> On Thu, Jul 27, 2023 at 12:56:54AM +0000, Volodymyr Babchuk wrote:
>> Hi Roger,
>> 
>> Roger Pau Monné <[email protected]> writes:
>> 
>> > On Wed, Jul 26, 2023 at 01:17:58AM +0000, Volodymyr Babchuk wrote:
>> >> 
>> >> Hi Roger,
>> >> 
>> >> Roger Pau Monné <[email protected]> writes:
>> >> 
>> >> > On Thu, Jul 20, 2023 at 12:32:31AM +0000, Volodymyr Babchuk wrote:
>> >> >> From: Oleksandr Andrushchenko <[email protected]>
>> >> >> @@ -498,6 +537,7 @@ void vpci_write(pci_sbdf_t sbdf, unsigned int reg, 
>> >> >> unsigned int size,
>> >> >>          ASSERT(data_offset < size);
>> >> >>      }
>> >> >>      spin_unlock(&pdev->vpci->lock);
>> >> >> +    unlock_locks(d);
>> >> >
>> >> > There's one issue here, some handlers will cal pcidevs_lock(), which
>> >> > will result in a lock over inversion, as in the previous patch we
>> >> > agreed that the locking order was pcidevs_lock first, d->pci_lock
>> >> > after.
>> >> >
>> >> > For example the MSI control_write() handler will call
>> >> > vpci_msi_arch_enable() which takes the pcidevs lock.  I think I will
>> >> > have to look into using a dedicated lock for MSI related handling, as
>> >> > that's the only place where I think we have this pattern of taking the
>> >> > pcidevs_lock after the d->pci_lock.
>> >> 
>> >> I'll mention this in the commit message. Is there something else that I
>> >> should do right now?
>> >
>> > Well, I don't think we want to commit this as-is with a known lock
>> > inversion.
>> >
>> > The functions that require the pcidevs lock are:
>> >
>> > pt_irq_{create,destroy}_bind()
>> > unmap_domain_pirq()
>> >
>> > AFAICT those functions require the lock in order to assert that the
>> > underlying device doesn't go away, as they do also use d->event_lock
>> > in order to get exclusive access to the data fields.  Please double
>> > check that I'm not mistaken.
>> 
>> You are right, all three function does not access any of PCI state
>> directly. However...
>> 
>> > If that's accurate you will have to check the call tree that spawns
>> > from those functions in order to modify the asserts to check for
>> > either the pcidevs or the per-domain pci_list lock being taken.
>> 
>> ... I checked calls for PT_IRQ_TYPE_MSI case, there is only call that
>> bothers me: hvm_pi_update_irte(), which calls IO-MMU code via
>> vmx_pi_update_irte():
>> 
>> amd_iommu_msi_msg_update_ire() or msi_msg_write_remap_rte().
>
> That path is only for VT-d, so strictly speaking you only need to worry
> about msi_msg_write_remap_rte().
>
> msi_msg_write_remap_rte() does take the IOMMU intremap lock.
>
> There are also existing callers of iommu_update_ire_from_msi() that
> call the functions without the pcidevs locked.  See
> hpet_msi_set_affinity() for example.

Thank you for clarifying this.

I have found another call path which causes troubles:
__pci_enable_msi[x] is called from pci_enable_msi() via vMSI, via
physdev_map_irq and also directly from ns16550 driver.

__pci_enable_msi[x] accesses pdev fields, mostly pdev->msix or
pdev->msi_list, so looks like we need pcidevs_lock(), as pdev fields are
not protected by d->pci_lock...

-- 
WBR, Volodymyr

Reply via email to