Philippe Gerum wrote:
> On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
>>>>>> Jan Kiszka wrote:
>>>>>>> Jan Kiszka wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>>>>>>> this could explain the bug:
>>>>>>>>
>>>>>>>>
>>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>> {
>>>>>>>> ...
>>>>>>>>         rpi = thread->rpi;
>>>>>>>>         if (unlikely(rpi == NULL))
>>>>>>>>                 return;
>>>>>>>>
>>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>
>>>>>>>>         /*
>>>>>>>>          * The RPI slot - if present - is always valid, and won't
>>>>>>>>          * change since the thread is resuming on this CPU and cannot
>>>>>>>>          * migrate under our feet. We may grab the remote slot lock
>>>>>>>>          * now.
>>>>>>>>          */
>>>>>>>>         xnsched_pop_rpi(thread);
>>>>>>>>         thread->rpi = NULL;
>>>>>>>>
>>>>>>>> ...
>>>>>>>>
>>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>>>>>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>>>>>>> not only pointing to the current thread but could refer to a foreign one
>>>>>>>> as well, right? Don't we need this:
>>>>>>>>
>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>> index 872c37f..1f995d6 100644
>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>>>  
>>>>>>>>        xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>>>  
>>>>>>>> +      /* Re-check under lock, someone may have cleared rpi by now. */
>>>>>>>> +      if (unlikely(thread->rpi == NULL)) {
>>>>>>>> +              xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>>>> +              return;
>>>>>>>> +      }
>>>>>>>> +
>>>>>>>>        /*
>>>>>>>>         * The RPI slot - if present - is always valid, and won't
>>>>>>>>         * change since the thread is resuming on this CPU and cannot
>>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>>>>>>> we really only race for clearance here?
>>>>>>>
>>>>>> I think so now; therefore, I'm proposing this:
>>>>>>
>>>>>> ----------->
>>>>>>
>>>>>> Most RPI services work on the current task or the one to be scheduled in
>>>>>> next, and are thus naturally serialized. But rpi_next is not, as it can
>>>>>> walk the chain of RPI requests for a CPU independently. In that case,
>>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
>>>>>> former loses after checking thread->rpi for NULL, we will dereference a
>>>>>> NULL pointer in xnsched_pop_rpi().
>>>>>>
>>>>>> Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
>>>>>> ---
>>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
>>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
>>>>>>
>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>> index 872c37f..cf7c08f 100644
>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>>          xnlock_get_irqsave(&rpi->rpilock, s);
>>>>>>  
>>>>>>          /*
>>>>>> +         * Re-check under lock. Someone may have invoked rpi_next and cleared
>>>>>> +         * rpi by now.
>>>>>> +         */
>>>>>> +        if (unlikely(!rpi_p(thread))) {
>>>>>> +                xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>> +                return;
>>>>>> +        }
>>>>>> +
>>>>>> +        /*
>>>>>>           * The RPI slot - if present - is always valid, and won't
>>>>>>           * change since the thread is resuming on this CPU and cannot
>>>>>>           * migrate under our feet. We may grab the remote slot lock
>>>>>>
>>>>> The suggested patch papers over the actual issue, which is that
>>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only affect
>>>> I don't think that, in our case, rpi_clear_remote called rpi_next and
>>>> therefore crashed. Rather, it should have been the scenario of both
>>>> running in parallel on different CPUs: the former on behalf of a
>>>> migrated shadow that wants to clear its remainders on the remote CPU,
>>>> the latter on that CPU, picking a new top RPI after some other shadow
>>>> was removed from the queue. Is this a possible scenario, and would your
>>>> patch cover it?
>>>>
>>>>> the RPI state of the argument thread which must be a local one, and not
>>>>> that of any random thread that happens to be linked to the remote RPI
>>>>> queue.
>>>>>
>>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
>>>>> allowing a concurrent invocation of itself on a remote CPU to fiddle
>>>>> with the rpi backlink of a thread which is not active on the
>>>>> local/per-CPU Xenomai scheduler instance, which is the point where
>>>>> things start to hit the crapper.
>>>>>
>>>>> Now, unless I can't even synchronize the couple of neurons I have left
>>>>> at this hour, the following patch should be the better fix for the issue,
>>>>> because it restores the two basic rules that apply to the whole RPI
>>>>> machinery, namely:
>>>>>
>>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
>>>>> queue, with the notable exception of rpi_clear_remote(), which may remove
>>>>> the given _local_ thread, and only that thread, from a remote RPI slot.
>>>>>
>>>>> - rpi_* calls may only change the RPI state of threads which are
>>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
>>>>> when called for setting up the RPI state of an emerging thread, in which
>>>>> case this is a no-conflict zone.
>>>>>
>>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
>>>>> should be immune from this particular bug.
>>>>>
>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>> index 872c37f..1397ed1 100644
>>>>> --- a/ksrc/nucleus/shadow.c
>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>>>   xnsched_pop_rpi(thread);
>>>>>   thread->rpi = NULL;
>>>>>  
>>>>> - if (rpi_next(rpi, s) == NULL)
>>>>> + /*
>>>>> +  * If the remote RPI queue was emptied, prepare for kicking
>>>>> +  * xnshadow_rpi_check() on the relevant CPU to demote the root
>>>>> +  * thread priority there.
>>>>> +  */
>>>>> + if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
>>>>>           rcpu = xnsched_cpu(rpi);
>>>>>  
>>>>>   xnlock_put_irqrestore(&rpi->rpilock, s);
>>>>>
>>>>>
>>>> I have to confess, I do not understand how the patch relates to our
>>>> crash. But that's because I still only have a semi-understanding of this
>>>> frighteningly complex RPI code. However, the fact that thread->rpi is now
>>>> again only checked outside its protecting lock leaves me with a very bad
>>>> feeling...
>>>>
>>> My point is that we should not have to protect a section of code which,
>>> by design, can never conflict; we will likely agree that sprinkling
>>> locks everywhere to get a warm and fuzzy feeling is no solution; it's
>>> actually a significant source of regression.
>>>
>>> The idea behind keeping most rpi_* operations applicable to locally
>>> scheduled threads is to introduce such a design, even when remote RPI
>>> slots are involved. thread->sched == xnpod_current_sched() for each
>>> rpi_*(sched, ...) call is what matters in this logic. Another
>>> original assumption was that no RPI updates could be done in interrupt
>>> context, which is now wrong due to the change in xnshadow_rpi_check().
>>>
>>> In short: we have to make sure that rpi_next() does not break the basic
>>> assumptions of the RPI core, first.
>> Please check my scenario again: My concern is that a thread can be
>> queued for a short while on a remote sched,
> 
> No, it can't; that is the crux of the matter. Well, at least, this
> should not be possible if the basic assumptions are preserved (have a
> look at the rpi_clear_remote() callers: the target thread may not
> migrate, be scheduled in Linux-wise, or exit RPI via rpi_pop() during
> the call -- all places where the rpi backlink may be cleared). Only a
> caller operating from the local CPU should be allowed to alter the RPI
> state of threads linked to the RPI slot of that same CPU.
> 
> rpi_clear_remote() is not even an exception to this, since it alters a
> remote RPI slot, but for a thread which does run on the local CPU.
> 
>>  and while that is the case,
>> it can be manipulated (wrt ->rpi) _concurrently_ (as we do not hold the
>> remote rpilock all the time). I'm quite sure now that your patch does
>> not change this.
> 
> My patch attempts to fix the core issue, not just to plug one of its
> bad outcomes.
> 
> Again, the point is not to pretend that your patch is wrong; it
> surely "plugs" one issue we have due to rpi_next(). The point is to make
> sure that all issues are covered, by fixing the usage of rpi_next(), or by
> finding another way to fix what rpi_next() was supposed to fix in the first
> place.

So, if you are right, we could (in theory) replace rpilock with local
IRQ disabling? That would be the proof for me that it doesn't matter to
test thread->rpi outside the lock.
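
Just to make sure we mean the same interleaving: rpi_clear_remote() samples
thread->rpi without holding rpilock, rpi_next() running on the remote CPU
unlinks the thread and clears the backlink, and only then does
rpi_clear_remote() take the lock and call xnsched_pop_rpi() on a stale,
now-NULL backlink. Spelled out as a sketch (simplified, not meant as final
code; only the xnlock_*/xnsched_* helpers and the thread->rpi backlink are
from the nucleus, the rest is illustrative):

static void rpi_clear_remote_sketch(struct xnthread *thread)
{
	struct xnsched *rpi;
	spl_t s;

	rpi = thread->rpi;	/* unlocked snapshot of the backlink */
	if (unlikely(rpi == NULL))
		return;

	xnlock_get_irqsave(&rpi->rpilock, s);

	/*
	 * rpi_next() on the remote CPU may have unlinked the thread and
	 * cleared thread->rpi between the snapshot above and taking the
	 * lock, so re-validate the backlink before dereferencing it.
	 */
	if (unlikely(thread->rpi == NULL)) {
		xnlock_put_irqrestore(&rpi->rpilock, s);
		return;
	}

	xnsched_pop_rpi(thread);
	thread->rpi = NULL;

	xnlock_put_irqrestore(&rpi->rpilock, s);
}

If the basic assumptions really hold, that re-check would be redundant -
which is what the question above is trying to pin down.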

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
