On 2024/8/19 17:16, Jan Beulich wrote:
> On 16.08.2024 13:08, Jiqian Chen wrote:
>> The gsi of a passthrough device must be configured for it to be
>> able to be mapped into a hvm domU.
>> But when dom0 is PVH, the gsis may not get registered (see the
>> clarification below). As a result, the apic, pin and irq info is
>> not added to the irq_2_pin list and the handler of irq_desc is not
>> set, so setting the ioapic affinity and vector fails when a device
>> is passed through.
>>
>> To fix the above problem, new code on the Linux kernel side will
>> need to call PHYSDEVOP_setup_gsi for passthrough devices to
>> register the gsi when dom0 is PVH.
>>
>> So, add PHYSDEVOP_setup_gsi to hvm_physdev_op for that purpose.
>>
>> Two questions need clarifying:
>> First, why does the gsi of devices belonging to PVH dom0 work?
>> Because when a driver is probed for a normal device, it uses the
>> normal probe function of a pci device. In its callstack, it requests
>> the irq and unmasks the corresponding ioapic pin of the gsi, which
>> traps into xen and finally registers the gsi.
>> The callstack is (on the linux kernel side) pci_device_probe->
>> request_threaded_irq-> irq_startup-> __unmask_ioapic->
>> io_apic_write, then trap into xen hvmemul_do_io->
>> hvm_io_intercept-> hvm_process_io_intercept->
>> vioapic_write_indirect-> vioapic_hwdom_map_gsi-> mp_register_gsi.
>> So the gsi gets registered.
>>
>> Second, why doesn't the gsi of a passthrough device work when dom0
>> is PVH?
>> Because when a device is assigned for passthrough, it uses the
>> specific probe function of pciback. In its callstack, it doesn't
>> install a fake irq handler because the ISR is not running. So
>> mp_register_gsi on the Xen side is never called, and the gsi is not
>> registered.
>> The callstack is (on the linux kernel side) pcistub_probe->
>> pcistub_seize-> pcistub_init_device-> xen_pcibk_reset_device->
>> xen_pcibk_control_isr->isr_on==0.
> 
> So: Underlying XSA-461 was the observation that the very limited set of
> cases where this fake IRQ handler is installed is an issue. The problem
> of dealing with "false" IRQs when a line-based interrupt is shared
> between devices affects all parties, not just Dom0 and not just PV
> guests. Therefore an alternative to the introduction of a new hypercall
> would be to simply leverage that the installation of such a handler
> will need widening anyway.
> 
> However, the installation of said handler presently also occurs in
> cases where it's not really needed - when the line isn't shared. Thus,
> if the handler registration would also be eliminated when it's not
> really needed, we'd be back to needing a separate hypercall.
> 
> So I think first of all it needs deciding what is going to be done in
> Linux, at least in pciback (as here we care about the Dom0 case only).
Agreed. So the current options are either to use a hypercall
(PHYSDEVOP_setup_gsi) or to install a fake IRQ handler in pciback.
So we may need input from the maintainers on the Linux side.
Hi Stefano and Juergen, what are your opinions?
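
To make the two options easier to compare, here is a toy C model of the
registration state. It is only a sketch and not Xen or Linux code: every
name in it (toy_register_gsi, toy_pciback_probe, toy_setup_gsi, NR_GSIS
and so on) is made up for illustration, and the "trap into Xen" is
reduced to a plain function call.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_GSIS 64 /* arbitrary size for the toy model */

/* Hypothetical stand-in for Xen's per-GSI registration state. */
static bool gsi_registered[NR_GSIS];

/* Models the effect of mp_register_gsi: the GSI becomes known to Xen. */
void toy_register_gsi(unsigned int gsi)
{
    assert(gsi < NR_GSIS);
    gsi_registered[gsi] = true;
}

/*
 * Normal driver probe: requesting the IRQ unmasks the IO-APIC pin,
 * the write traps into Xen, and the GSI is registered as a side effect.
 */
void toy_normal_probe(unsigned int gsi)
{
    toy_register_gsi(gsi);
}

/*
 * pciback probe: without a fake handler nothing unmasks the pin, so
 * nothing traps and the GSI stays unregistered. Installing the fake
 * handler (second option) restores the side effect.
 */
void toy_pciback_probe(unsigned int gsi, bool install_fake_handler)
{
    if (install_fake_handler)
        toy_register_gsi(gsi);
}

/* First option: an explicit request, modelled on PHYSDEVOP_setup_gsi. */
int toy_setup_gsi(unsigned int gsi)
{
    if (gsi >= NR_GSIS)
        return -1;
    toy_register_gsi(gsi);
    return 0;
}

/* Passthrough setup can only succeed for a registered GSI. */
bool toy_can_passthrough(unsigned int gsi)
{
    return gsi < NR_GSIS && gsi_registered[gsi];
}
```

In this model, toy_pciback_probe(gsi, false) leaves toy_can_passthrough(gsi)
false until either toy_setup_gsi(gsi) is called or the probe is repeated with
the fake handler installed, which is the choice being asked about here.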

> 
> Jan

-- 
Best regards,
Jiqian Chen.
