Jan Kiszka wrote:
> Take a step back and look at the root cause for this issue again. Unlocked
> if need-resched is inherently racy and will always be (not only for the
> remote reschedule case BTW).
Ok, let us examine what may happen with this code if we only set the
XNRESCHED bit on the local cpu. First, bits other than XNRESCHED do not
matter, because they cannot change under our feet. So, we have two
cases for this race:
1- we see the XNRESCHED bit, but it has been cleared by the time nklock is
locked in __xnpod_schedule.
2- we do not see the XNRESCHED bit, but it gets set right after we test it.
1 is not a problem.
2 is not a problem, because anything which sets the XNRESCHED bit (it may
only be an interrupt in fact) will cause xnpod_schedule to be called
right after that.
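To make the two cases concrete, here is a minimal single-threaded model in C. It is only a sketch, not the nucleus code: struct model_sched, the model_* functions and the XNRESCHED value are stand-ins invented for illustration, and it assumes that the locked path re-tests the bit once nklock is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in value for illustration, not the real Xenomai constant. */
#define XNRESCHED 0x1u

/* Hypothetical model of one CPU's scheduler slot. */
struct model_sched {
    unsigned status;   /* status word, only written by the local CPU here */
    bool nklock_held;  /* models nklock being held */
    int switches;      /* number of reschedules actually performed */
};

/* Models the locked slow path: the bit is re-tested under the lock
 * (an assumption of this sketch). */
static void model_locked_schedule(struct model_sched *s)
{
    assert(!s->nklock_held);
    s->nklock_held = true;
    if (s->status & XNRESCHED) {
        s->status &= ~XNRESCHED;
        s->switches++;
    }
    s->nklock_held = false;
}

/* Models the unlocked fast path: test the bit first, lock only if set. */
static void model_schedule(struct model_sched *s)
{
    if (!(s->status & XNRESCHED))
        return;
    model_locked_schedule(s);
}

/* Models a local interrupt that sets the bit; the reschedule call
 * follows right after, as described for case 2. */
static void model_irq(struct model_sched *s)
{
    s->status |= XNRESCHED;
    model_schedule(s);
}
```

In this model, case 1 amounts to calling model_locked_schedule() with the bit already cleared, which does nothing. Case 2 amounts to model_schedule() returning early, followed by model_irq(), which performs the missed reschedule.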
So no, no race here provided that we only set the XNRESCHED bit on the
local cpu.
> So we either have to accept this and remove the
> debugging check from the scheduler or push the check back to
> __xnpod_schedule where it once came from. When this is cleaned up, we
> can look into the remote resched protocol again.
The problem with the debug check is that it checks whether the scheduler
state is modified without the XNRESCHED bit being set. And there, yes,
we do have a race: the scheduler state may be modified before the
XNRESCHED bit is set by an IPI.
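The window can be shown with a small deterministic model in C. Again, this is a sketch under assumptions: the structure, the generation counters and the model_* functions are hypothetical and do not reflect how the real debug check is implemented. They only reproduce the ordering "state modified first, XNRESCHED set later by the IPI".

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in value for illustration, not the real Xenomai constant. */
#define XNRESCHED 0x1u

/* Hypothetical model of the scheduler slot targeted by a remote CPU. */
struct model_sched {
    unsigned status;    /* status word holding XNRESCHED */
    int state_gen;      /* bumped whenever the scheduler state changes */
    int checked_gen;    /* generation last seen by the debug check */
    bool ipi_pending;   /* an IPI was sent but has not been handled yet */
};

/* A remote CPU modifies this scheduler's state and sends an IPI;
 * XNRESCHED itself is only set later, by the IPI handler. */
static void model_remote_wakeup(struct model_sched *s)
{
    s->state_gen++;
    s->ipi_pending = true;
}

/* The IPI handler sets XNRESCHED on the target CPU. */
static void model_ipi_handler(struct model_sched *s)
{
    if (s->ipi_pending) {
        s->ipi_pending = false;
        s->status |= XNRESCHED;
    }
}

/* Returns true when the check would complain: the state changed since
 * the last check, but XNRESCHED is not set. */
static bool model_debug_check(struct model_sched *s)
{
    bool changed = s->state_gen != s->checked_gen;

    s->checked_gen = s->state_gen;
    return changed && !(s->status & XNRESCHED);
}
```

If model_debug_check() runs between model_remote_wakeup() and model_ipi_handler(), it complains although nothing is wrong. If it runs after the handler, it stays quiet.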
If we want to fix the debug check, we have to have a special bit, one in
the sched->status flag, only for the purpose of debugging. Or remove the
Xenomai-core mailing list