Jan Kiszka wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> Hi,
>>>
>>> I'm meditating over an oops in rpi_clear_remote. NULL pointer deref, it
>>> /seems/ like thread->rpi is invalid. Looking at the code, I wonder if
>>> this could explain the bug:
>>>
>>>
>>> static void rpi_clear_remote(struct xnthread *thread)
>>> {
>>> ...
>>>         rpi = thread->rpi;
>>>         if (unlikely(rpi == NULL))
>>>                 return;
>>>
>>>         xnlock_get_irqsave(&rpi->rpilock, s);
>>>
>>>         /*
>>>          * The RPI slot - if present - is always valid, and won't
>>>          * change since the thread is resuming on this CPU and cannot
>>>          * migrate under our feet. We may grab the remote slot lock
>>>          * now.
>>>          */
>>>         xnsched_pop_rpi(thread);
>>>         thread->rpi = NULL;
>>>
>>> ...
>>>
>>> So we deref (xnsched_pop_rpi) and clear thread->rpi under rpilock, but
>>> we check for it without any protection? Sounds racy. I think 'thread' is
>>> not only pointing to the current thread but could refer to a foreign one
>>> as well, right? Don't we need this:
>>>
>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>> index 872c37f..1f995d6 100644
>>> --- a/ksrc/nucleus/shadow.c
>>> +++ b/ksrc/nucleus/shadow.c
>>> @@ -331,6 +331,12 @@ static void rpi_clear_remote(struct xnthread *thread)
>>>  
>>>     xnlock_get_irqsave(&rpi->rpilock, s);
>>>  
>>> +   /* Re-check under lock, someone may have cleared rpi by now. */
>>> +   if (unlikely(thread->rpi == NULL)) {
>>> +           xnlock_put_irqrestore(&rpi->rpilock, s);
>>> +           return;
>>> +   }
>>> +
>>>     /*
>>>      * The RPI slot - if present - is always valid, and won't
>>>      * change since the thread is resuming on this CPU and cannot
>> Another worry: Can thread->rpi become != rpi without being NULL? Or can
>> we really only race for clearance here?
>>
> 
> I think so now, therefore I'm proposing this:
> 
> ----------->
> 
> Most RPI services work on the current task or the one to be scheduled in
> next, thus are naturally serialized. But rpi_next is not as it can walk

...serialized /wrt changes of thread->rpi.

> the chain of RPI requests for a CPU independently. In that case,
> clearing RPI via rpi_clear_remote can race with rpi_next, and if the
> former loses after checking thread->rpi for NULL, we will dereference a
> NULL pointer in xnsched_pop_rpi().
> 
> Signed-off-by: Jan Kiszka <jan.kis...@siemens.com>
> ---
>  ksrc/nucleus/shadow.c |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index 872c37f..cf7c08f 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -332,6 +332,15 @@ static void rpi_clear_remote(struct xnthread *thread)
>       xnlock_get_irqsave(&rpi->rpilock, s);
>  
>       /*
> +      * Re-check under lock. Someone may have invoked rpi_next and cleared
> +      * rpi by now.
> +      */
> +     if (unlikely(!rpi_p(thread))) {
> +             xnlock_put_irqrestore(&rpi->rpilock, s);
> +             return;
> +     }
> +
> +     /*
>        * The RPI slot - if present - is always valid, and won't
>        * change since the thread is resuming on this CPU and cannot
>        * migrate under our feet. We may grab the remote slot lock
> 
> 

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core

Reply via email to