Jan Kiszka wrote:
> Am 04.11.2010 10:26, Jan Kiszka wrote:
>> Am 04.11.2010 10:16, Gilles Chanteperdrix wrote:
>>> Jan Kiszka wrote:
>>>> Take a step back and look at the root cause for this issue again. Unlocked
>>>>    if need-resched
>>>>            __xnpod_schedule
>>>> is inherently racy and will always be (not only for the remote
>>>> reschedule case BTW).
>>> Ok, let us examine what may happen with this code if we only set the
>>> XNRESCHED bit on the local cpu. First, other bits than XNRESCHED do not
>>> matter, because they can not change under our feet. So, we have two
>>> cases for this race:
>>> 1- we see the XNRESCHED bit, but it has been cleared once nklock is
>>> locked in __xnpod_schedule.
>>> 2- we do not see the XNRESCHED bit, but it get set right after we test it.
>>> 1 is not a problem.
>> Yes, as long as we remove the debug check from the scheduler code (or
>> fix it somehow). The scheduling code already catches this race.
>>> 2 is not a problem, because anything which sets the XNRESCHED (it may
>>> only be an interrupt in fact) bit will cause xnpod_schedule to be called
>>> right after that.
>>> So no, no race here provided that we only set the XNRESCHED bit on the
>>> local cpu.
>>>  So we either have to accept this and remove the
>>>> debugging check from the scheduler or push the check back to
>>>> __xnpod_schedule where it once came from. When this it cleaned up, we
>>>> can look into the remote resched protocol again.
>>> The problem of the debug check is that it checks whether the scheduler
>>> state is modified without the XNRESCHED bit being set. And this is the
>>> problem, because yes, in that case, we have a race: the scheduler state
>>> may be modified before the XNRESCHED bit is set by an IPI.
>>> If we want to fix the debug check, we have to have a special bit, on in
>>> the sched->status flag, only for the purpose of debugging. Or remove the
>>> debug check.
>> Exactly my point. Is there any benefit in keeping the debug check? The
>> code to make it work may end up as "complex" as the logic it verifies,
>> at least that's my current feeling.
> This would be the radical approach of removing the check (and cleaning
> up some bits). If it's acceptable, I would split it up properly.

This debug check saved our asses when debugging SMP issues, and I
suspect it may help debugging skin issues. So, I think we should try and
keep it.

> @@ -2162,21 +2163,21 @@ static inline void xnpod_switch_to(xnsched_t *sched,
>  static inline int __xnpod_test_resched(struct xnsched *sched)
>  {
> -     int resched = testbits(sched->status, XNRESCHED);
> +     int resched = xnsched_resched_p(sched);
>  #ifdef CONFIG_SMP
>       /* Send resched IPI to remote CPU(s). */
> -     if (unlikely(xnsched_resched_p(sched))) {
> +     if (unlikely(resched)) {

At first sight, here you are more breaking things than cleaning them.


Xenomai-core mailing list

Reply via email to