Hello Julien,
On 24.10.18 17:41, Julien Grall wrote: > vGIC is not only about GICv2. It also covers GICv3 that is accessible > via system registers. Yes, I know. But, as you state below, for GICv2 based systems those barriers are not needed. >>> rely on a context synchronising >>> operation on the CPU to ensure that the updated status register is >>> visible to the CPU when handling the interrupt. This usually happens as >>> a result of taking the IRQ exception in the first place, but there are >>> two race scenarios where this isn't the case. >>> >>> For example, let's say we have two peripherals (X and Y), where Y >>> uses a >>> system register for its interrupt status. >>> >>> Case 1: >>> 1. CPU takes an IRQ exception as a result of X raising an interrupt >>> 2. Y then raises its interrupt line, but the update to its system >>> register is not yet visible to the CPU >>> 3. The GIC decides to expose Y's interrupt number first in the Ack >>> register >>> 4. The CPU runs the IRQ handler for Y, but the status register is stale >> But this scenario somehow explains a strange thing I saw during IRQ >> latency investigation (optimization attempts) during last weeks. >> The strange sequence is as following: >> 1. CPU takes an IRQ exception from the guest context >> 2. In `enter_hypervisor_head()` function (i.e. in >> `gic_clear_lrs()` >> as for mainline) from some LR registers interrupt statuses are read as >> PENDING >> 3. Performing further code, without returning to the guest (i.e. >> inside vgic_vcpu_inject_irq()), it happens that we read status INVALID >> from the LR we read PENDING before, in step 2. > >> Please note that we are tailoring xen based on RELEASE-4.10.0. > > As you tailor Xen, I need more details on your modification to be able > to provide feedback on the oddness you encounter. I guess I should make a dedicated patch applicable to mainline to reveal the issue. I hope I'll do this nearest days. > But you are using GICv2, right? If so, you are not using system > registers for the vGIC. This is not covered by this patch. Yep, our SoC has GICv2. >>> Case 2: >>> 1. CPU takes an IRQ exception as a result of X raising an interrupt >>> 2. CPU reads the interrupt number for X from the Ack register and runs >>> its IRQ handler >>> 3. Y raises its interrupt line and the Ack register is updated, but >>> again, the update to its system register is not yet visible to the >>> CPU. >>> 4. Since the GIC drivers poll the Ack register, we read Y's interrupt >>> number and run its handler without a context synchronisation >>> operation, therefore seeing the stale register value. >>> >>> In either case, we run the risk of missing an IRQ. This patch solves >>> the >>> problem by ensuring that we execute an ISB in the GIC drivers prior >>> to invoking the interrupt handler. >>> >>> Based on Linux commit 39a06b67c2c1256bcf2361a1f67d2529f70ab206 >>> "irqchip/gic: Ensure we have an ISB between ack and ->handle_irq". >>> >>> Signed-off-by: Julien Grall <julien.gr...@arm.com> >>> >>> --- >>> This patch is a candidate for backporting up to Xen 4.9. >>> --- >>> xen/arch/arm/gic.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c >>> index 8d7e491060..305fbd66dd 100644 >>> --- a/xen/arch/arm/gic.c >>> +++ b/xen/arch/arm/gic.c >>> @@ -388,12 +388,14 @@ void gic_interrupt(struct cpu_user_regs *regs, >>> int is_fiq) >>> if ( likely(irq >= 16 && irq < 1020) ) >>> { >>> local_irq_enable(); >>> + isb(); >> Taking in account that the first GICH accesses are from >> `gic_clear_lrs()`, called from `enter_hypervisor_head`, I would suggest >> moving isb() there. Any comments here? >> >>> do_IRQ(regs, irq, is_fiq); >>> local_irq_disable(); >>> } >>> else if ( is_lpi(irq) ) >>> { >>> local_irq_enable(); >>> + isb(); >>> gic_hw_ops->do_LPI(irq); >>> local_irq_disable(); >>> } >> > > Cheers, > -- *Andrii Anisov*
_______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel