On Mon, Feb 14, 2022 at 09:36:39AM +0000, Oleksandr Andrushchenko wrote:
> 
> 
> On 11.02.22 13:40, Roger Pau Monné wrote:
> > +
> >>>>        for ( i = 0; i < msix->max_entries; i++ )
> >>>>        {
> >>>>            const struct vpci_msix_entry *entry = &msix->entries[i];
> >>> Since this function is now called with the per-domain rwlock read
> >>> locked it's likely not appropriate to call process_pending_softirqs
> >>> while holding such lock (check below).
> >> You are right, as it is possible that:
> >>
> >> process_pending_softirqs -> vpci_process_pending -> read_lock
> >>
> >> Even more, vpci_process_pending may also
> >>
> >> read_unlock -> vpci_remove_device -> write_lock
> >>
> >> in its error path. So, any invocation of process_pending_softirqs
> >> must not hold d->vpci_rwlock at least.
> >>
> >> And also we need to check that pdev->vpci was not removed
> >> in between or *re-created*
> >>> We will likely need to re-iterate over the list of pdevs assigned to
> >>> the domain and assert that the pdev is still assigned to the same
> >>> domain.
> >> So, do you mean a pattern like the below should be used at all
> >> places where we need to call process_pending_softirqs?
> >>
> >> read_unlock
> >> process_pending_softirqs
> >> read_lock
> >> pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.devfn);
> >> if ( pdev && pdev->vpci && is_the_same_vpci(pdev->vpci) )
> >> <continue processing>
> > Something along those lines. You likely need to continue iterate using
> > for_each_pdev.
> How do we tell if pdev->vpci is the same? Jan has already brought
> this question before [1] and I was about to use some ID for that purpose:
> pdev->vpci->id = d->vpci_id++ and then we use pdev->vpci->id  for checks

Given this is a debug message I would be OK with just doing the
minimal checks to prevent Xen from crashing (ie: pdev->vpci exists)
and that the resume MSI entry is not past the current limit. Otherwise
just print a message and move on to the next device.

The recreating of pdev->vpci only occurs as a result of some admin
operations, and doing it while also trying to print the current MSI
status is not a reliable approach. So dumping an incomplete or
incoherent state as a result of ongoing admin operations would be
fine.

Thanks, Roger.

Reply via email to