Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Hi,
>>>>> I'm hitting that bug check in __xnpod_schedule after
>>>>> xnintr_clock_handler issued a xnpod_schedule like this:
>>>>>   if (--sched->inesting == 0) {
>>>>>           __clrbits(sched->status, XNINIRQ);
>>>>>           xnpod_schedule();
>>>>>   }
>>>>> Either the assumption behind the bug check is no longer correct (no call
>>>>> to xnpod_schedule() without a real need), or we should check for
>>>>> __xnpod_test_resched(sched) in xnintr_clock_handler (but under nklock 
>>>>> then).
>>>>> Comments?
>>>> You probably have a real bug. This BUG_ON means that the scheduler is
>>>> about to switch context for real, whereas the resched bit is not set,
>>>> which is wrong.
>>> This happened on my 2.6.35 port - maybe some spurious IRQ enabling.
>>> Debugging further...
>> You should look for something which changes the scheduler state without
>> setting the resched bit, or for something which clears the bit without
>> taking the scheduler changes into account.
> It looks like a generic Xenomai issue on SMP boxes, though a mostly
> harmless one:
> The task that was scheduled in without XNRESCHED set locally has been
> woken up by a remote CPU. The waker requeued the task and set the
> resched condition for itself and in the resched proxy mask for the
> remote CPU. But there is at least one place in the Xenomai code where we
> drop the nklock between xnsched_set_resched and xnpod_schedule:
> do_taskexit_event (I bet there are even more). Now the resched target
> CPU runs into a timer handler, issues xnpod_schedule unconditionally,
> and happens to find the woken-up task before it is actually informed via
> an IPI.
> I think this is a harmless race, but it ruins the debug assertion
> "need_resched != 0".

Not that harmless, since without the debugging code, we would miss the
reschedule too...
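
To make the interleaving concrete, here is a minimal single-threaded sketch of the sequence described above. It is a toy model, not the nucleus code: every model_* name, the two-CPU setup and the priority fields are hypothetical, and the test_bit_first flag only assumes that a build without the debug check returns early when the local resched bit is clear, which is how the remark about missing the reschedule reads.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 2

/* Outcome of one scheduling pass in this toy model. */
enum model_result { MODEL_NOOP, MODEL_SWITCH, MODEL_SWITCH_NO_RESCHED };

/* Hypothetical per-CPU state; stands in for the real scheduler slot. */
struct model_sched {
    bool resched;              /* models the local XNRESCHED bit */
    unsigned int resched_mask; /* models the proxy mask of CPUs to notify */
    int curr_prio;             /* priority of the running task */
    int ready_prio;            /* best queued priority, -1 if queue is empty */
};

struct model_sched model_cpus[MODEL_NR_CPUS];

void model_init(void)
{
    for (int i = 0; i < MODEL_NR_CPUS; i++) {
        model_cpus[i].resched = false;
        model_cpus[i].resched_mask = 0;
        model_cpus[i].curr_prio = 0;
        model_cpus[i].ready_prio = -1;
    }
}

/* Waker on CPU `self` requeues a task on CPU `target`: it sets its own
 * resched bit and only records the target in its proxy mask. */
void model_wakeup(int self, int target, int prio)
{
    model_cpus[target].ready_prio = prio;
    model_cpus[self].resched = true;
    model_cpus[self].resched_mask |= 1u << target;
}

/* The waker finally reschedules: pending CPUs get their bit set,
 * which models the IPI arriving. */
void model_send_ipis(int self)
{
    for (int i = 0; i < MODEL_NR_CPUS; i++)
        if (model_cpus[self].resched_mask & (1u << i))
            model_cpus[i].resched = true;
    model_cpus[self].resched_mask = 0;
}

/* One scheduling pass on `cpu`. With test_bit_first set, the pass bails
 * out when the local bit is clear (assumed non-debug behaviour). */
enum model_result model_schedule(int cpu, bool test_bit_first)
{
    struct model_sched *s = &model_cpus[cpu];
    bool had_resched = s->resched;

    if (test_bit_first && !had_resched)
        return MODEL_NOOP;

    s->resched = false;
    if (s->ready_prio <= s->curr_prio)
        return MODEL_NOOP;

    /* A real context switch happens here. */
    s->curr_prio = s->ready_prio;
    s->ready_prio = -1;
    return had_resched ? MODEL_SWITCH : MODEL_SWITCH_NO_RESCHED;
}
```

Calling model_wakeup(0, 1, 10) and then model_schedule(1, false) before model_send_ipis(0) yields MODEL_SWITCH_NO_RESCHED, the condition the bug check complains about. With test_bit_first set, the same call returns MODEL_NOOP and the woken task stays queued until the simulated IPI arrives, so the reschedule is delayed rather than caught.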


Xenomai-core mailing list
