Hi, all!

While working on PCI passthrough on Arm I stepped onto a crash
with the following call chain:

pci_physdev_op
   pci_add_device
       init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
   iommu_add_device <- FAILS
   vpci_remove_device -> xfree(pdev->vpci)

Then:
leave_hypervisor_to_guest
   vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL

Which results in the crash below:

(XEN) Data Abort Trap. Syndrome=0x6
(XEN) Walking Hypervisor VA 0x10 on CPU0 via TTBR 0x00000000481dd000
(XEN) 0TH[0x0] = 0x00000000481dcf7f
(XEN) 1ST[0x0] = 0x00000000481d9f7f
(XEN) 2ND[0x0] = 0x0000000000000000
(XEN) CPU0: Unexpected Trap: Data Abort
...
(XEN) Xen call trace:
(XEN)    [<00000000002246d8>] _spin_lock+0x40/0xa4 (PC)
(XEN)    [<00000000002246c0>] _spin_lock+0x28/0xa4 (LR)
(XEN)    [<000000000024f6d0>] vpci_process_pending+0x78/0x128
(XEN)    [<000000000027f7e8>] leave_hypervisor_to_guest+0x50/0xcc
(XEN)    [<0000000000269c5c>] entry.o#guest_sync_slowpath+0xa8/0xd4

So, it seems that if pci_add_device fails and calls vpci_remove_device
the later needs to cancel any pending work.

If this is a map operation it seems to be straightforward: destroy
the range set and do not map anything.

If vpci_remove_device is called and unmap operation was scheduled
then it can be that:
- guest is being destroyed for any reason and skipping unmap is ok
   as all the mappings for the whole domain will be destroyed anyways
- guest is still going to stay alive and then unmapping must be done

I would like to hear your thought what would be the right approach
to take in order to solve the issue.

Thank you in advance,
Oleksandr

Reply via email to