Dmitry Adamushko wrote:
> On 23/01/06, Gilles Chanteperdrix <[EMAIL PROTECTED]> wrote:
>> Jeroen Van den Keybus wrote:
>>> Hello,
>>
>> [ skip-skip-skip ]
>>
>> Since in xnshadow_harden, the running thread marks itself as suspended
>> before running wake_up_interruptible_sync, the gatekeeper will run when
>> schedule() gets called, which in turn depends on the CONFIG_PREEMPT*
>> configuration. In the non-preempt case, the current thread will be
>> suspended and the gatekeeper will run when schedule() is explicitly
>> called in xnshadow_harden(). In the preempt case, schedule gets called
>> when the outermost spinlock is unlocked in wake_up_interruptible_sync().
>
> In fact, no.
>
> wake_up_interruptible_sync() doesn't set the need_resched "flag" up. That's
> why it's "sync" actually.
>
> Only if the need_resched was already set before calling
> wake_up_interruptible_sync(), then yes.
>
> The sequence is as follows:
>
> wake_up_interruptible_sync ---> wake_up_sync ---> wake_up_common(...,
> sync=1, ...) ---> ... ---> try_to_wake_up(..., sync=1)
>
> Look at the end of try_to_wake_up() to see when it calls resched_task().
> The comment there speaks for itself.
>
> So let's suppose need_resched == 0 (it's per-task of course).
> As a result of wake_up_interruptible_sync() the new task is added to the
> current active run-queue but need_resched remains unset in the hope
> that the waker will call schedule() on its own soon.
>
> I have CONFIG_PREEMPT set on my machine but I have never encountered the bug
> described by Jan.
>
> The catalyst of the problem, I guess, is that some IRQ interrupts a task
> between wake_up_interruptible_sync() and schedule() and its ISR, in turn,
> wakes up another task whose prio is higher than the one of our waker (as a
> result, the need_resched flag is set). And now, rescheduling occurs on
> return from irq handling code (ret_from_intr -> ... -> preempt_schedule_irq()
> -> schedule()).

Yes, this is exactly what happened. Unfortunately I did not save the
related traces I took with the extended ipipe-tracer (the one I sent ends
too early), but they showed a preemption right after the wake_up, first
by one of the other real-time threads in Jeroen's scenario, and then, as
a result of some xnshadow_relax() of that thread, a Linux
preempt_schedule to the gatekeeper. We do not see this bug that often
because it requires a specific load and must hit a really small race
window.

>
> Some events should coincide, yep. But I guess that problem does not occur
> every time?
>
> I have not checked it yet but my presupposition that something as easy as :
>
> preempt_disable()
>
> wake_up_interruptible_sync();
> schedule();
>
> preempt_enable();

It's a no-go: "scheduling while atomic". That was one of my first
attempts to solve it. The only way to enter schedule() without being
preemptible is via PREEMPT_ACTIVE. But the effect of that flag should be
well-known now. Kind of a Gordian knot. :(

>
>
> could work... err.. and don't blame me if no, it's some one else who has
> written that nonsense :o)
>
> --
> Best regards,
> Dmitry Adamushko
>

Jan
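
For readers following the thread, the race window discussed here can be modelled in a few lines of user-space C. This is only a toy sketch of the behaviour described in the messages, not kernel or Xenomai code: the toy_* names, the priority values and the single-CPU assumption are all made up for illustration. It shows that a sync wakeup alone leaves need_resched unset, while a non-sync wakeup from an ISR inside the window sets it, so a preemptible kernel reschedules on interrupt return before the waker reaches its own schedule().

```c
#include <stdbool.h>

/* Toy user-space model, NOT kernel code. All toy_* names are hypothetical. */
struct toy_task {
    int prio;           /* toy convention: larger value = higher priority */
    bool need_resched;  /* models the per-task need_resched flag */
};

/*
 * Models the behaviour described for the end of try_to_wake_up():
 * a sync wakeup queues the woken task but leaves the waker's
 * need_resched flag alone; a non-sync wakeup of a higher-priority
 * task sets it. Single CPU is assumed.
 */
static void toy_wake_up(struct toy_task *curr,
                        const struct toy_task *woken, bool sync)
{
    if (!sync && woken->prio > curr->prio)
        curr->need_resched = true;
}

/*
 * Models return from interrupt on a preemptible kernel: the interrupted
 * task is preempted only if need_resched is set and preemption is not
 * disabled (preempt_count == 0).
 */
static bool toy_irq_return_preempts(const struct toy_task *curr,
                                    int preempt_count)
{
    return curr->need_resched && preempt_count == 0;
}

/*
 * Returns true if the waker would be preempted between its sync wakeup
 * of the gatekeeper and its own explicit schedule() call.
 */
bool toy_harden_race(bool irq_in_window)
{
    struct toy_task waker      = { .prio = 1,  .need_resched = false };
    struct toy_task gatekeeper = { .prio = 10, .need_resched = false };
    struct toy_task other      = { .prio = 5,  .need_resched = false };

    /* Step 1: sync wakeup of the gatekeeper, need_resched stays unset. */
    toy_wake_up(&waker, &gatekeeper, true);

    /* Step 2 (optional): an ISR wakes a higher-priority task, non-sync. */
    if (irq_in_window)
        toy_wake_up(&waker, &other, false);

    /* Step 3: would the interrupt return path preempt the waker now? */
    return toy_irq_return_preempts(&waker, 0);
}
```

In this model toy_harden_race(false) reports no preemption and toy_harden_race(true) reports one, which matches the observation that the bug needs an interrupt to land inside a very small window.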