Jan Kiszka wrote:
> Am 04.11.2010 00:56, Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Am 04.11.2010 00:44, Gilles Chanteperdrix wrote:
>>>> Jan Kiszka wrote:
>>>>> Am 04.11.2010 00:18, Gilles Chanteperdrix wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Am 04.11.2010 00:11, Gilles Chanteperdrix wrote:
>>>>>>>> Jan Kiszka wrote:
>>>>>>>>> Am 03.11.2010 23:11, Jan Kiszka wrote:
>>>>>>>>>> Am 03.11.2010 23:03, Jan Kiszka wrote:
>>>>>>>>>>> But we do not always use atomic ops for manipulating status
>>>>>>>>>>> bits (though we do in other cases where there is no need -
>>>>>>>>>>> different story). This may fix the race:
>>>>>>>>>> Err, nonsense. As we also manipulate xnsched::status outside
>>>>>>>>>> of nklock protection, we must _always_ use atomic ops.
>>>>>>>>>>
>>>>>>>>>> This screams for a cleanup: local-only bits like XNHTICK or
>>>>>>>>>> XNINIRQ should be moved into a separate status word that can
>>>>>>>>>> then be safely modified non-atomically.
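
To make the proposed split concrete, a minimal sketch in C (the lstatus
field name is made up for illustration):

        struct xnsched {
                unsigned long status;   /* cross-CPU bits: atomic ops only */
                unsigned long lstatus;  /* local-only bits (XNHTICK, XNINIRQ):
                                         * written solely by the owning CPU,
                                         * so a plain read-modify-write
                                         * without locks or atomics is safe */
                /* ... */
        };
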
>>>>>>>>> Second try to fix and clean up the sched status bits. Anders, please
>>>>>>>>> test.
>>>>>>>>>
>>>>>>>>> Jan
>>>>>>>>>
>>>>>>>>> diff --git a/include/nucleus/pod.h b/include/nucleus/pod.h
>>>>>>>>> index 01ff0a7..5987a1f 100644
>>>>>>>>> --- a/include/nucleus/pod.h
>>>>>>>>> +++ b/include/nucleus/pod.h
>>>>>>>>> @@ -277,12 +277,10 @@ static inline void xnpod_schedule(void)
>>>>>>>>>        * context is active, or if we are caught in the middle of a
>>>>>>>>>        * unlocked context switch.
>>>>>>>>>        */
>>>>>>>>> -#if XENO_DEBUG(NUCLEUS)
>>>>>>>>>       if (testbits(sched->status, XNKCOUT|XNINIRQ|XNSWLOCK))
>>>>>>>>>               return;
>>>>>>>>> -#else /* !XENO_DEBUG(NUCLEUS) */
>>>>>>>>> -     if (testbits(sched->status,
>>>>>>>>> -                  XNKCOUT|XNINIRQ|XNSWLOCK|XNRESCHED) != XNRESCHED)
>>>>>>>>> +#if !XENO_DEBUG(NUCLEUS)
>>>>>>>>> +     if (!sched->resched)
>>>>>>>>>               return;
>>>>>>>>>  #endif /* !XENO_DEBUG(NUCLEUS) */
>>>>>>>> Having only one test was really nice here; maybe we could simply
>>>>>>>> issue a barrier before reading the status?
>>>>>>>>
>>>>>>> I agree - but the alternative is letting all modifications of
>>>>>>> xnsched::status use atomic bitops (that's required when folding all
>>>>>>> bits into a single word). And that should be much more costly,
>>>>>>> especially on SMP.
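
The cost difference in question, sketched with the Linux bitop variants
(assuming the status bits were kept as bit numbers; nr is a placeholder):

        set_bit(nr, &sched->status);    /* atomic RMW: a locked bus cycle
                                         * on SMP x86 for every update */
        __set_bit(nr, &sched->status);  /* plain RMW: cheap, but only safe
                                         * when all writers are serialized,
                                         * e.g. same CPU or under nklock */
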
>>>>>> What about issuing a barrier before testing the status?
>>>>>>
>>>>> The problem is not about reading but about writing the status
>>>>> concurrently, so it's not about the code you see above.
>>>> The bits are modified under nklock, which implies a barrier when
>>>> unlocked. Furthermore, an IPI is guaranteed to be received on the
>>>> remote CPU after this barrier, so a barrier on the reader side should
>>>> be enough to see the modifications made remotely.
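
A sketch of the ordering Gilles describes, using the nucleus' setbits,
testbits and nklock helpers (where exactly the barriers would pair up is
an assumption):

        /* CPU A, under nklock */
        setbits(sched->status, XNRESCHED);
        xnlock_put_irqrestore(&nklock, s);      /* unlock implies the
                                                 * release barrier */
        /* ... then the IPI to CPU B is raised */

        /* CPU B, upon receiving the IPI */
        smp_rmb();      /* pairs with the release above, so the
                         * XNRESCHED update is visible here */
        if (testbits(sched->status, XNRESCHED))
                /* reschedule */;
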
>>> Check nucleus/intr.c for tons of unprotected status modifications.
>> OK. Then maybe we should reconsider the original decision to start
>> fiddling with the XNRESCHED bit remotely.
> 
> ...which removed complexity and fixed a race? Let's rather review the
> checks done in xnpod_schedule vs. its callers; I bet there is more to
> save (IOW: remove the need to test for sched->resched).

Not that much complexity... and the race was a false positive in debug
code, no big deal. At least it worked, and it has done so for a long
time. No atomic needed, no barrier, only one test in xnpod_schedule. And
a nice invariant: sched->status is always accessed on the local CPU.
What else?
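
For contrast, a sketch of the invariant-preserving scheme Gilles is
defending (helper names such as resched_handler and the cpumask type are
made up; the point is that sched->status is only ever written by the CPU
that owns it):

        /* CPU A wants CPU B to reschedule: no remote write, just an IPI */
        static void request_resched(xnarch_cpumask_t target)
        {
                xnarch_send_ipi(target);
        }

        /* IPI handler, running on CPU B itself */
        static void resched_handler(void)
        {
                struct xnsched *sched = xnpod_current_sched();

                setbits(sched->status, XNRESCHED);      /* local write only */
                xnpod_schedule();
        }

Under that rule, no atomic ops and no barriers are needed on
sched->status at all.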


-- 
                                                                Gilles.
