On Thu, Oct 28, 2021 at 10:04:20AM +0000, Oleksandr Andrushchenko wrote:
> Hi, all!
> 
> While working on PCI passthrough on Arm I stepped onto a crash
> with the following call chain:
> 
> pci_physdev_op
>    pci_add_device
>        init_bars -> modify_bars -> defer_map -> 
> raise_softirq(SCHEDULE_SOFTIRQ)
>    iommu_add_device <- FAILS
>    vpci_remove_device -> xfree(pdev->vpci)
> 
> Then:
> leave_hypervisor_to_guest
>    vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
> 
> Which results in the crash below:
> 
> (XEN) Data Abort Trap. Syndrome=0x6
> (XEN) Walking Hypervisor VA 0x10 on CPU0 via TTBR 0x00000000481dd000
> (XEN) 0TH[0x0] = 0x00000000481dcf7f
> (XEN) 1ST[0x0] = 0x00000000481d9f7f
> (XEN) 2ND[0x0] = 0x0000000000000000
> (XEN) CPU0: Unexpected Trap: Data Abort
> ...
> (XEN) Xen call trace:
> (XEN)    [<00000000002246d8>] _spin_lock+0x40/0xa4 (PC)
> (XEN)    [<00000000002246c0>] _spin_lock+0x28/0xa4 (LR)
> (XEN)    [<000000000024f6d0>] vpci_process_pending+0x78/0x128
> (XEN)    [<000000000027f7e8>] leave_hypervisor_to_guest+0x50/0xcc
> (XEN)    [<0000000000269c5c>] entry.o#guest_sync_slowpath+0xa8/0xd4
> 
> So, it seems that if pci_add_device fails and calls vpci_remove_device
> the later needs to cancel any pending work.

Indeed, you will need to check that v->vpci.pdev == pdev before
canceling the pending work though, or else you could be canceling
pending work from a different device.

> If this is a map operation it seems to be straightforward: destroy
> the range set and do not map anything.
> 
> If vpci_remove_device is called and unmap operation was scheduled
> then it can be that:
> - guest is being destroyed for any reason and skipping unmap is ok
>    as all the mappings for the whole domain will be destroyed anyways
> - guest is still going to stay alive and then unmapping must be done
> 
> I would like to hear your thought what would be the right approach
> to take in order to solve the issue.

For the hardware domain it's likely better to do nothing, and just try
to continue execution. The worse that could happen is that MMIO mappings
are left in place when the device has been deassigned.

For unprivileged domains that get a failure in the middle of a vPCI
{un}map operation we need to destroy them, as we don't know in which
state the p2m is. This can only happen in vpci_process_pending for
domUs I think, as they won't be allowed to call pci_add_device. Please
see the FIXME in vpci_process_pending related to this topic.

Regards, Roger.

Reply via email to