On Sat, 2011-07-16 at 11:15 +0200, Jan Kiszka wrote:
> On 2011-07-16 10:52, Philippe Gerum wrote:
> > On Sat, 2011-07-16 at 10:13 +0200, Jan Kiszka wrote:
> >> On 2011-07-15 15:10, Jan Kiszka wrote:
> >>> But... right now it looks like we found our primary regression:
> >>> "nucleus/shadow: shorten the uninterruptible path to secondary mode".
> >>> It opens a short window during relax where the migrated task may be
> >>> active under both schedulers. We are currently evaluating a revert
> >>> (looks good so far), and I need to work out my theory in more
> >>> details.
> >>
> >> Looks like this commit just made a long-standing flaw in Xenomai's
> >> interrupt handling more visible: We reschedule over the interrupt stack
> >> in the Xenomai interrupt handler tails, at least on x86-64. Not sure if
> >> other archs have interrupt stacks, the point is Xenomai's design wrongly
> >> assumes there are no such things.
> >
> > Fortunately, no, this is not a design issue, no such assumption was ever
> > made, but the Xenomai core expects this to be handled on a per-arch
> > basis with the interrupt pipeline.
> 
> And that's already the problem: If Linux uses interrupt stacks, relying 
> on ipipe to disable this during Xenomai interrupt handler execution is 
> at best a workaround. A fragile one unless you increase the per-thread 
> stack size by the size of the interrupt stack. Lacking support for a 
> generic rescheduling hook became a problem by the time Linux introduced 
> interrupt threads.

Don't assume too much. What was done for ppc64 was not meant as a
general policy. Again, this is a per-arch decision.

> 
> > As you pointed out, there is no way
> > to handle this via some generic Xenomai-only support.
> >
> > ppc64 now has separate interrupt stacks, which is why I disabled
> > IRQSTACKS which became the builtin default at some point. Blackfin goes
> > through a Xenomai-defined irq tail handler as well, because it may not
> > reschedule over nested interrupt stacks.
> 
> How does this arch prevent xnpod_schedule in the generic interrupt 
> handler tail from doing its normal work?

It polls some hw status to know whether a rescheduling would be safe.
See xnarch_escalate().

> 
> > Fact is that such pending
> > problem with x86_64 was overlooked since day #1 by /me.
> >
> >>   We were lucky so far that the values
> >> saved on this shared stack were apparently "compatible", meaning we were
> >> overwriting them with identical or harmless values. But that's no longer
> >> true when interrupts are hitting us in the xnpod_suspend_thread path of
> >> a relaxing shadow.
> >>
> >
> > Makes sense. It would be better to find a solution that does not make
> > the relax path uninterruptible again for a significant amount of time.
> > On low end platforms we support (i.e. non-x86* mainly), this causes
> > obvious latency spots.
> 
> I agree. Conceptually, the interruptible relaxation should be safe now 
> after recent fixes.
> 
> >
> >> Likely the only possible fix is establishing a reschedule hook for
> >> Xenomai in the interrupt exit path after the original stack is restored
> >> - just like Linux works. Requires changes to both ipipe and Xenomai
> >> unfortunately.
> >
> > __ipipe_run_irqtail() is in the I-pipe core for such purpose. If
> > instantiated properly for x86_64, and paired with xnarch_escalate() for
> > that arch as well, it could be an option for running the rescheduling
> > procedure when safe.
> 
> Nope, that doesn't work. The stack is switched later in the return path 
> in entry_64.S. We need a hook there, ideally a conditional one, 
> controlled by some per-cpu variable that is set by Xenomai on return 
> from its interrupt handlers to signal the rescheduling need.
> 

Yes, makes sense. The way to make it conditional without dragging bits
of Xenomai logic into the kernel innards is not obvious though.
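
For illustration only, the conditional hook discussed above could be
modelled in plain user-space C roughly as follows. This is a minimal
sketch, not I-pipe or Xenomai code: NR_CPUS_SKETCH, rt_resched_pending,
rt_irq_tail(), irq_exit_hook() and rt_reschedule() are all hypothetical
names, and a real implementation would use a true per-cpu variable and
sit in the arch return path (entry_64.S on x86_64) after the original
stack has been restored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical CPU count for this user-space model. */
#define NR_CPUS_SKETCH 4

/* Hypothetical per-CPU "rescheduling needed" flag (a plain array here). */
static bool rt_resched_pending[NR_CPUS_SKETCH];

/* Counts how often the rescheduling procedure actually ran. */
static int resched_calls;

/* Stand-in for the real-time rescheduling procedure. */
static void rt_reschedule(int cpu)
{
    (void)cpu;
    resched_calls++;
}

/*
 * Tail of the real-time interrupt handler. It may still be running on
 * the interrupt stack, so it only records the need to reschedule.
 */
static void rt_irq_tail(int cpu, bool need_resched)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    if (need_resched)
        rt_resched_pending[cpu] = true;
}

/*
 * Hook assumed to run in the interrupt exit path once the original
 * stack is back in place. It reschedules only if the flag was set.
 */
static void irq_exit_hook(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    if (!rt_resched_pending[cpu])
        return;
    rt_resched_pending[cpu] = false;
    rt_reschedule(cpu);
}
```

The point of the split is that the exit path only tests one flag, so
no Xenomai logic has to be pulled into the kernel entry code itself.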

It is probably time to officially introduce "exo-kernel" oriented bits
into the Linux thread info. PTDs have too loose semantics to be practical
if we want to avoid trashing the I-cache by calling probe hooks within
the dual kernel, each time we want to check some basic condition (e.g.
resched needed). A backlink to a foreign TCB there would help too.

Which leads us to killing the ad hoc kernel threads (and stacks) at some
point, which are an absolute pain.

> Jan

-- 
Philippe.



_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
