On Tue, 2010-04-27 at 11:34 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Tue, 2010-04-27 at 11:27 +0200, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>> On Tue, 2010-04-27 at 10:25 +0200, Jan Kiszka wrote:
> >>>> Philippe Gerum wrote:
> >>>>> On Tue, 2010-04-27 at 08:46 +0200, Jan Kiszka wrote:
> >>>>>> Philippe Gerum wrote:
> >>>>>>> On Mon, 2010-04-26 at 18:06 +0200, Jan Kiszka wrote:
> >>>>>>>> Jan Kiszka wrote:
> >>>>>>>>> Jan Kiszka wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer 
> >>>>>>>>>> deref, it
> >>>>>>>>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder 
> >>>>>>>>>> if
> >>>>>>>>>> this could explain the bug:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> static void rpi_clear_remote(struct xnthread *thread)
> >>>>>>>>>> {
> >>>>>>>>>> ...
> >>>>>>>>>>         rpi = thread->rpi;
> >>>>>>>>>>         if (unlikely(rpi == NULL))
> >>>>>>>>>>                 return;
> >>>>>>>>>>
> >>>>>>>>>>         xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>>>
> >>>>>>>>>>         /*
> >>>>>>>>>>          * The RPI slot - if present - is always valid, and won't
> >>>>>>>>>>          * change since the thread is resuming on this CPU and 
> >>>>>>>>>> cannot
> >>>>>>>>>>          * migrate under our feet. We may grab the remote slot lock
> >>>>>>>>>>          * now.
> >>>>>>>>>>          */
> >>>>>>>>>>         xnsched_pop_rpi(thread);
> >>>>>>>>>>         thread->rpi = NULL;
> >>>>>>>>>>
> >>>>>>>>>> ...
> >>>>>>>>>>
> >>>>>>>>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, 
> >>>>>>>>>> but
> >>>>>>>>>> we check for it without any protection? Sounds racy. I think 
> >>>>>>>>>> 'thread' is
> >>>>>>>>>> not only pointing to the current thread but could refer to a 
> >>>>>>>>>> foreign one
> >>>>>>>>>> as well, right? Don't we need this:
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>>>> index 872c37f..1f995d6 100644
> >>>>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread 
> >>>>>>>>>> *thread)
> >>>>>>>>>>  
> >>>>>>>>>>    xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>>>  
> >>>>>>>>>> +  /* Re-check under lock, someone may have cleared rpi by now. */
> >>>>>>>>>> +  if (unlikely(thread->rpi == NULL)) {
> >>>>>>>>>> +          xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>>>> +          return;
> >>>>>>>>>> +  }
> >>>>>>>>>> +
> >>>>>>>>>>    /*
> >>>>>>>>>>     * The RPI slot - if present - is always valid, and won't
> >>>>>>>>>>     * change since the thread is resuming on this CPU and cannot
> >>>>>>>>> Another worry: Can thread->rpi become != rpi without being NULL? Or 
> >>>>>>>>> can
> >>>>>>>>> we really only race for clearance here?
> >>>>>>>>>
> >>>>>>>> I think so now, therefore I'm proposing this:
> >>>>>>>>
> >>>>>>>> ----------->
> >>>>>>>>
> >>>>>>>> Most RPI services work on the current task or the one to be 
> >>>>>>>> scheduled in
> >>>>>>>> next, thus are naturally serialized. But rpi_next is not as it can 
> >>>>>>>> walk
> >>>>>>>> the chain of RPI requests for a CPU independently. In that case,
> >>>>>>>> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> >>>>>>>> former loses after checking thread->rpi for NULL, we will 
> >>>>>>>> dereference a
> >>>>>>>> NULL pointer in xnsched_pop_rpi().
> >>>>>>>>
> >>>>>>>> Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
> >>>>>>>> ---
> >>>>>>>>  ksrc/nucleus/shadow.c |    9 +++++++++
> >>>>>>>>  1 files changed, 9 insertions(+), 0 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>>> index 872c37f..cf7c08f 100644
> >>>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>>> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread 
> >>>>>>>> *thread)
> >>>>>>>>      xnlock_get_irqsave(&rpi->rpilock, s);
> >>>>>>>>  
> >>>>>>>>      /*
> >>>>>>>> +     * Re-check under lock. Someone may have invoked rpi_next and 
> >>>>>>>> cleared
> >>>>>>>> +     * rpi by now.
> >>>>>>>> +     */
> >>>>>>>> +    if (unlikely(!rpi_p(thread))) {
> >>>>>>>> +            xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>> +            return;
> >>>>>>>> +    }
> >>>>>>>> +
> >>>>>>>> +    /*
> >>>>>>>>       * The RPI slot - if present - is always valid, and won't
> >>>>>>>>       * change since the thread is resuming on this CPU and cannot
> >>>>>>>>       * migrate under our feet. We may grab the remote slot lock
> >>>>>>>>
> >>>>>>> The suggested patch papers over the actual issue, which is that
> >>>>>>> rpi_clear_remote() may not invoke rpi_next(), because it may only 
> >>>>>>> affect
> >>>>>> I don't think that in our case rpi_clear_remote called rpi_next and
> >>>>>> therefore crashed. It should rather have been the scenario of both
> >>>>>> running in parallel on different CPUs, the former on behalf of a
> >>>>>> migrated shadow that wants to clear its remainders on the remote CPU,
> >>>>>> the latter on that CPU, picking a new top RPI after some other shadow
> >>>>>> was removed from the queue. Is this a possible scenario, and would your
> >>>>>> patch cover it?
> >>>>>>
> >>>>>>> the RPI state of the argument thread which must be a local one, and 
> >>>>>>> not
> >>>>>>> that of any random thread that happens to be linked to the remote RPI
> >>>>>>> queue.
> >>>>>>>
> >>>>>>> By calling rpi_next(), rpi_clear_remote() shoots itself in the foot,
> >>>>>>> allowing a concurrent invocation of itself on a remote CPU, to fiddle
> >>>>>>> with the rpi backlink of a thread which is not active on the
> >>>>>>> local/per-cpu Xenomai scheduler instance, which is the point where
> >>>>>>> things start to hit the crapper.
> >>>>>>>
> >>>>>>> Now, unless I can't even synchronize the couple of neurons I have left
> >>>>>>> at this hour, the following patch should better fix the issue, because
> >>>>>>> it restores the two basic rules that apply to the whole RPI machinery,
> >>>>>>> namely:
> >>>>>>>
> >>>>>>> - rpi_* calls may only alter the contents of the local scheduler's RPI
> >>>>>>> queue, with the notable exception of rpi_clear_remote() which may 
> >>>>>>> remove
> >>>>>>> the given _local_ thread only, from a remote RPI slot.
> >>>>>>>
> >>>>>>> - rpi_* calls may only change the RPI state of threads which are
> >>>>>>> controlled by the local Xenomai scheduler instance, except rpi_push()
> >>>>>>> when called for setting up the RPI state of an emerging thread, in 
> >>>>>>> which
> >>>>>>> case this is a no-conflict zone.
> >>>>>>>
> >>>>>>> That breakage was introduced in the early 2.5.1 timeframe, so 2.4.x
> >>>>>>> should be immune from this particular bug.
> >>>>>>>
> >>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> >>>>>>> index 872c37f..1397ed1 100644
> >>>>>>> --- a/ksrc/nucleus/shadow.c
> >>>>>>> +++ b/ksrc/nucleus/shadow.c
> >>>>>>> @@ -340,7 +340,12 @@ static void rpi_clear_remote(struct xnthread 
> >>>>>>> *thread)
> >>>>>>>       xnsched_pop_rpi(thread);
> >>>>>>>       thread->rpi = NULL;
> >>>>>>>  
> >>>>>>> -     if (rpi_next(rpi, s) == NULL)
> >>>>>>> +     /*
> >>>>>>> +      * If the remote RPI queue was emptied, prepare for kicking
> >>>>>>> +      * xnshadow_rpi_check() on the relevant CPU to demote the root
> >>>>>>> +      * thread priority there.
> >>>>>>> +      */
> >>>>>>> +     if (xnsched_peek_rpi(rpi) == NULL) /* No rpi_next() here. */
> >>>>>>>               rcpu = xnsched_cpu(rpi);
> >>>>>>>  
> >>>>>>>       xnlock_put_irqrestore(&rpi->rpilock, s);
> >>>>>>>
> >>>>>>>
> >>>>>> I have to confess, I do not understand how the patch may relate to our
> >>>>>> crash. But that's because I still only have a semi-understanding of 
> >>>>>> this
> >>>>>> frightening complex RPI code. However, the fact that thread->rpi is now
> >>>>>> again only checked outside its protecting lock leaves me with a very 
> >>>>>> bad
> >>>>>> feeling...
> >>>>>>
> >>>>> My point is that we should not have to protect a section of code which
> >>>>> may never conflict in any case, by design; we will likely agree that
> >>>>> sprinkling locks everywhere to get a warm and fuzzy feeling is no
> >>>>> solution, it's actually a significant source of regression.
> >>>>>
> >>>>> The idea, behind keeping most rpi_* operations applicable to locally
> >>>>> scheduled threads, is to introduce such a design, even when remote RPI
> >>>>> slots are involved. thread->sched == xnpod_current_sched() for each
> >>>>> rpi_*(sched, ...) calls is what is important in this logic. Another
> >>>>> original assumption was that no RPI updates could be done in interrupt
> >>>>> context, which is now wrong due to the change in xnshadow_rpi_check().
> >>>>>
> >>>>> In short: we have to make sure that rpi_next() does not break the basic
> >>>>> assumptions of the RPI core, first.
> >>>> Please check my scenario again: My concern is that a thread can be
> >>>> queued for a short while on a remote sched,
> >>> No, it can't, that is the crux of the matter, well, at least, this
> >>> should not be possible if the basic assumptions are preserved (have a
> >>> look at the rpi_clear_remote() callers, the target thread may not
> >>> migrate or be scheduled in linux-wise, or exit RPI via rpi_pop() during
> >>> the call -- all places where the rpi backlink may be cleared). Only a
> >>> caller operating from the local CPU should be allowed to alter the RPI
> >>> state of threads linked to the RPI slot of that same CPU.
> >>>
> >>> rpi_clear_remote() is not even an exception to this, since it alters a
> >>> remote RPI slot, but for a thread which does run on the local CPU.
> >>>
> >>>>  and while that is the case,
> >>>> it can be manipulated (/wrt ->rpi) _concurrently_ (as we do not hold the
> >>>> remote rpilock all the time). I'm quite sure now that your patch does
> >>>> not change this.
> >>> My patch attempts fixing the core issue, not just plugging one of its
> >>> bad outcomes.
> >>>
> >>> Again, the point is not to pretend that your patch is wrong, and it
> >>> surely "plugs" one issue we have due to rpi_next(). The point is to make
> >>> sure that all issues are covered, by fixing the usage of rpi_next(), or
> >>> find another way to fix what rpi_next() was supposed to fix in the first
> >>> place.
> >> So, if you are right, we could (in theory) replace rpilock with local
> >> IRQ disabling? That would be the proof from me that it doesn't matter to
> >> test thread->rpi outside the lock.
> > 
> > Noo, you did not get it yet. The rpilock is protecting the per-cpu RPI
> > queues, NOT thread->rpi. 
> 
> So linking a thread to an RPI queue and setting its ->rpi are not always
> related?

Exactly. Normally, a remote CPU should not be allowed to compete with
another one for updating the rpi backlink of a thread it does NOT own,
meaning not linked to the local scheduler. 

My understanding is that rpi_next() is deliberately breaking that rule
when called from rpi_clear_remote() for a non-local scheduler, therefore
altering rpi backlinks of threads which do NOT belong to the local
scheduler.

Who is wrong then? Well, I tend to think that rpi_next() should not be
called in the only cross-CPU context we have, namely rpi_clear_remote().
In that very particular case, where we own the target thread, and we
want to complete its recent migration to our local CPU.

This is why I was insisting on the notion of calling context for all
rpi* calls, with respect to what threads may be touched by a given CPU. 

But, I am now worried by xnshadow_rpi_check(), which - by calling
rpi_next() instead of simply peeking at the RPI queue head - introduces
a potential race wrt the rpi backlink, when we deal with
rpi_clear_remote() on the _same_ CPU.

> 
> Jan
> 


-- 
Philippe.



_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core

Reply via email to