Jan Kiszka <[email protected]> writes:
> On 07.06.21 17:04, Philippe Gerum wrote:
>>
>> Jan Kiszka <[email protected]> writes:
>>
>>> On 07.06.21 15:48, Jan Kiszka wrote:
>>>> On 07.06.21 15:44, Jan Kiszka wrote:
>>>>> On 07.06.21 15:03, Philippe Gerum wrote:
>>>>>>
>>>>>> Jan Kiszka <[email protected]> writes:
>>>>>>
>>>>>>> On 07.06.21 09:17, Philippe Gerum wrote:
>>>>>>>>
>>>>>>>> Jan Kiszka <[email protected]> writes:
>>>>>>>>
>>>>>>>>> On 06.06.21 16:43, Philippe Gerum via Xenomai wrote:
>>>>>>>>>>
>>>>>>>>>> Jan Kiszka <[email protected]> writes:
>>>>>>>>>>
>>>>>>>>>>> From: Jan Kiszka <[email protected]>
>>>>>>>>>>>
>>>>>>>>>>> The correct order is hard IRQs off first, then stalling inband, see also
>>>>>>>>>>> I-pipe. Otherwise, we risk to take an interrupt for the inband stage but
>>>>>>>>>>> do not inject it before returning to user space.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Jan Kiszka <[email protected]>
>>>>>>>>>>> ---
>>>>>>>>>>>
>>>>>>>>>>> Fixes the RCU stalls Florian reported, at least for me.
>>>>>>>>>>>
>>>>>>>>>>> I'm afraid this wasn't the last regression of translating I-pipe into
>>>>>>>>>>> dovetail..
>>>>>>>>>>>
>>>>>>>>>>>  include/linux/entry-common.h | 3 ++-
>>>>>>>>>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
>>>>>>>>>>> index 0fb45b2d6094..00540110985e 100644
>>>>>>>>>>> --- a/include/linux/entry-common.h
>>>>>>>>>>> +++ b/include/linux/entry-common.h
>>>>>>>>>>> @@ -203,7 +203,8 @@ static inline void local_irq_disable_exit_to_user(void);
>>>>>>>>>>>  #ifndef local_irq_disable_exit_to_user
>>>>>>>>>>>  static inline void local_irq_disable_exit_to_user(void)
>>>>>>>>>>>  {
>>>>>>>>>>> -	local_irq_disable_full();
>>>>>>>>>>> +	hard_cond_local_irq_disable();
>>>>>>>>>>> +	local_irq_disable();
>>>>>>>>>>>  }
>>>>>>>>>>>  #endif
>>>>>>>>>>
>>>>>>>>>> Good spot, real issue here, just like the CPU_IDLE code deals with. The
>>>>>>>>>> fix looks good, but maybe we could do even better.
>>>>>>>>>>
>>>>>>>>>> The way local_irq_disable_full() works tends to introduce this issue in
>>>>>>>>>> a sneaky way, when the code path does not synchronize the interrupt log
>>>>>>>>>> (exit_to_user_mode_loop may be at fault in this case, which does not
>>>>>>>>>> traverses synchronize_pipeline()). Lack of synchronization would still
>>>>>>>>>> hit us if some trap handler turns hard irqs off -> on -> off the same
>>>>>>>>>> way, and we don't leave through the user exit path.
>>>>>>>>>>
>>>>>>>>>> Inverting the order of the actions in local_irq_disable_full() may be an
>>>>>>>>>> option (inband_irq_disable would allow that), making sure we cannot be
>>>>>>>>>> caught in the same issue when returning to kernel mode is another
>>>>>>>>>> way. This needs more thought I think.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So, always this way?
>>>>>>>>>
>>>>>>>>> local_irq_disable_full:
>>>>>>>>>   hard_local_irq_disable
>>>>>>>>>   local_irq_disable
>>>>>>>>>
>>>>>>>>> local_irq_enable_full:
>>>>>>>>>   hard_local_irq_enable
>>>>>>>>>   local_irq_enable
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes.
>>>>>>>>
>>>>>>>>> Or even flip local_irq_enable_full as well?
>>>>>>>>
>>>>>>>> Not this one, hard irqs must be on before local_irq_enable() is issued
>>>>>>>> otherwise we would trigger a debug assertion. The reason for such
>>>>>>>> assertion is that the I-pipe version of local_irq_enable() force enables
>>>>>>>> hard irqs on, and this is something we want to depart from because
>>>>>>>> whether the caller expects or not such side-effect is error-prone. For
>>>>>>>> this reason, inband_irq_enable() expects hard irqs on, which should be
>>>>>>>> the current hard irq state for the caller under normal circumstances.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Was there ever code that required ordering local_irq_disable_full the
>>>>>>>>> other way around?
>>>>>>>>>
>>>>>>>>
>>>>>>>> After review, I don't think so. The current order for
>>>>>>>> local_irq_disable_full() was rather determined by applying a reverse
>>>>>>>> sequence compared to local_irq_enable_full(). So this looks good. I need
>>>>>>>> to test this on the armv[7/8] side here first before merging.
>>>>>>>>
>>>>>>>
>>>>>>> Is there a commit somewhere that I can drop already?
>>>>>>>
>>>>>>
>>>>>> 1. These are fine with me in their latest iteration:
>>>>>>
>>>>>> irq_pipeline: Clean up stage_info field and users
>>>>>> irq_pipeline: Account for stage migration across faults
>>>>>> irq_pipeline: Warn when calling irqentry_enter with oob stalled
>>>>>> x86: dovetail: Fix TS flag reservation
>>>>>>
>>>>>> 2. This one should be replaced by a fix in local_irq_disable_full(),
>>>>>> pending some tests ongoing here:
>>>>>>
>>>>>> irq_pipeline: Prevent returning to user space with pending inband IRQs
>>>>>>
>>>>>> However, as I mentioned earlier, this may not be sufficient per se, we
>>>>>> may have to synchronize the in-band log when unstalling the stage on the
>>>>>> kernel exit path, i.e. the exit point fixed up by the "Account for stage
>>>>>> migration across faults" patch. This would address the following
>>>>>> scenario:
>>>>>>
>>>>>> irqentry_enter on kernel context:
>>>>>>     stall_inband();
>>>>>>     trap_handler();
>>>>>>         hard_local_irq_enable()
>>>>>>         hard_local_irq_disable()
>>>>>>     if (state.stage_info == IRQENTRY_INBAND_UNSTALLED ||
>>>>>>         state.stage_info == IRQENTRY_OOB) {
>>>>>>         unstall_inband();
>>>>>>         synchronize_pipeline();
>>>>>
>>>>> Oh, unstall_inband does not synchronize? Because of
>>>>> hard_local_irq_disable or in general?
>>>>>
>>>>
>>>> Ah, found it - unstall != inband_irq_enable. Why not using the latter
>>>> directly?
>>>>
>>>
>>> ...because it's instrumented to detect hard_irqs_disabled.
>>>
>>> Can we safely sync at that point?
>>
>> Yes we may do this, because we restore the original -unstalled- state of
>> the in-band stage. So if we have pending IRQs, we may play them at this
>> point. What we may not do obviously, is creating synchronization points
>> where the kernel does not expect irqs.
>>
>
> Again: When will any reschedules that might be requested by those
> replayed IRQs be handled then?
>

Nowhere unless we provide a log synchronization routine dedicated to the
in-band kernel exit path, which would fold in the rescheduling call via
irqentry_exit_cond_resched(). I'll write that one.

-- 
Philippe.
