On 04.07.2023 23:03, Volodymyr Babchuk wrote:
> I am currently implementing your proposal (along with Jan's
> suggestions), but I am facing ABBA deadlock with IOMMU's
> reassign_device() call, which has this piece of code:
> 
>         list_move(&pdev->domain_list, &target->pdev_list);
> 
> My immediate change was:
> 
>         write_lock(&pdev->domain->pci_lock);
>         list_del(&pdev->domain_list);
>         write_unlock(&pdev->domain->pci_lock);
> 
>         write_lock(&target->pci_lock);
>         list_add(&pdev->domain_list, &target->pdev_list);
>         write_unlock(&target->pci_lock);
> 
> But this will not work because reassign_device is called from
> pci_release_devices() which iterates over d->pdev_list, so we need to
> take a d->pci_lock early.
> 
> Any suggestions on how to fix this? My idea is to remove a device from a
> list one at time:
> 
> int pci_release_devices(struct domain *d)
> {
>     struct pci_dev *pdev;
>     u8 bus, devfn;
>     int ret;
> 
>     pcidevs_lock();
>     write_lock(&d->pci_lock);
>     ret = arch_pci_clean_pirqs(d);
>     if ( ret )
>     {
>         pcidevs_unlock();
>         write_unlock(&d->pci_lock);
>         return ret;
>     }
> 
>     while ( !list_empty(&d->pdev_list) )
>     {
>         pdev = list_entry(&d->pdev_list, struct pci_dev, domain_list);
>         bus = pdev->bus;
>         devfn = pdev->devfn;
>         list_del(&pdev->domain_list);
>         write_unlock(&d->pci_lock);
>         ret = deassign_device(d, pdev->seg, bus, devfn) ?: ret;
>         write_lock(&d->pci_lock);

I think it needs doing almost like this, but with two more tweaks and
no list_del() right here (first and foremost to avoid needing to
figure whether removing early isn't going to subtly break anything;
see below for an error case that would end up with changed behavior):

    while ( !list_empty(&d->pdev_list) )
    {
        const struct pci_dev *pdev = list_first_entry(&d->pdev_list, struct 
pci_dev, domain_list);
        uint16_t seg = pdev->seg;
        uint8_t bus = pdev->bus;
        uint8_t devfn = pdev->devfn;

        write_unlock(&d->pci_lock);
        ret = deassign_device(d, seg, bus, devfn) ?: ret;
        write_lock(&d->pci_lock);
    }

One caveat though: The original list_for_each_entry_safe() guarantees
the loop to complete; your use of list_del() would guarantee that too,
but then the device wouldn't be on the list anymore if deassign_device()
failed. Therefore I guess you will need another local list where you
(temporarily) put all the devices which deassign_device() left on the
list, and which you would then move back to d->pdev_list after the loop
has finished. (Whether it is sufficient to inspect the list head to
determine whether the pdev is still on the list needs careful checking.)

Jan

Reply via email to