Re: [Xenomai-core] [PULL] native: Fix msendq fastlock leakage

Jan Kiszka Mon, 20 Jun 2011 10:09:09 -0700

On 2011-06-19 15:00, Gilles Chanteperdrix wrote:
> On 06/19/2011 01:17 PM, Gilles Chanteperdrix wrote:
>> On 06/19/2011 12:14 PM, Gilles Chanteperdrix wrote:
>>> I am working on this ppd cleanup issue again, I am asking for help to
>>> find a fix in -head for all cases where the sys_ppd is needed during
>>> some cleanup.
>>>
>>> The problem is that when the ppd cleanup is invoked:
>>> - we have no guarantee that current is a thread from the Xenomai
>>> application;
>>> - if it is, current->mm is NULL.
>>>
>>> So, associating the sys_ppd to either current or current->mm does not
>>> work. What we could do is pass the sys_ppd to all the other ppds cleanup
>>> handlers, this would fix cases such as freeing mutexes fastlock, but
>>> that does not help when the sys_ppd is needed during a thread deletion hook.
>>>
>>> I would like to find a solution where simply calling xnsys_ppd_get()
>>> will work, where we do not have an xnsys_ppd_get for each context, such
>>> as for instance xnsys_ppd_get_by_mm/xnsys_ppd_get_by_task_struct,
>>> because it would be too error-prone.
>>>
>>> Any idea anyone?
>>
>> The best I could come up with: use a ptd to store the mm currently 
>> being cleaned up, so that xnshadow_ppd_get continues to work, even
>> in the middle of a cleanup.
> 
> In order to also get xnshadow_ppd_get to work in task deletion hooks 
> (which is needed to avoid the issue at the origin of this thread), we 
> also need to set this ptd upon shadow mapping, so it is still there 
> when reaching the task deletion hook (where current->mm may be NULL). 
> Hence the patch:
> 
> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
> index b243600..6bc4210 100644
> --- a/ksrc/nucleus/shadow.c
> +++ b/ksrc/nucleus/shadow.c
> @@ -65,6 +65,11 @@ int nkthrptd;
>  EXPORT_SYMBOL_GPL(nkthrptd);
>  int nkerrptd;
>  EXPORT_SYMBOL_GPL(nkerrptd);
> +int nkmmptd;
> +EXPORT_SYMBOL_GPL(nkmmptd);
> +
> +#define xnshadow_mmptd(t) ((t)->ptd[nkmmptd])
> +#define xnshadow_mm(t) ((struct mm_struct *)xnshadow_mmptd(t))


xnshadow_mm() can now return a no longer existing mm. So no user of
xnshadow_mm should ever dereference that pointer. Thus we better change
all that user to treat the return value as a void pointer e.g.

>  
>  struct xnskin_slot {
>       struct xnskin_props *props;
> @@ -1304,6 +1309,8 @@ int xnshadow_map(xnthread_t *thread, xncompletion_t 
> __user *u_completion,
>        * friends.
>        */
>       xnshadow_thrptd(current) = thread;
> +     xnshadow_mmptd(current) = current->mm;
> +
>       rthal_enable_notifier(current);
>  
>       if (xnthread_base_priority(thread) == 0 &&
> @@ -2759,7 +2766,15 @@ static void detach_ppd(xnshadow_ppd_t * ppd)
>  
>  static inline void do_cleanup_event(struct mm_struct *mm)
>  {
> +     struct task_struct *p = current;
> +     struct mm_struct *old;
> +
> +     old = xnshadow_mm(p);
> +     xnshadow_mmptd(p) = mm;
> +
>       ppd_remove_mm(mm, &detach_ppd);
> +
> +     xnshadow_mmptd(p) = old;

I don't have the full picture yet, but that feels racy: If the context
over which we clean up that foreign mm is also using xnshadow_mmptd,
other threads in that process may dislike this temporary change.

>  }
>  
>  RTHAL_DECLARE_CLEANUP_EVENT(cleanup_event);
> @@ -2925,7 +2940,7 @@ EXPORT_SYMBOL_GPL(xnshadow_unregister_interface);
>  xnshadow_ppd_t *xnshadow_ppd_get(unsigned muxid)
>  {
>       if (xnpod_userspace_p())
> -             return ppd_lookup(muxid, current->mm);
> +             return ppd_lookup(muxid, xnshadow_mm(current) ?: current->mm);
>  
>       return NULL;
>  }
> @@ -2960,8 +2975,9 @@ int xnshadow_mount(void)
>       sema_init(&completion_mutex, 1);
>       nkthrptd = rthal_alloc_ptdkey();
>       nkerrptd = rthal_alloc_ptdkey();
> +     nkmmptd = rthal_alloc_ptdkey();
>  
> -     if (nkthrptd < 0 || nkerrptd < 0) {
> +     if (nkthrptd < 0 || nkerrptd < 0 || nkmmptd < 0) {
>               printk(KERN_ERR "Xenomai: cannot allocate PTD slots\n");
>               return -ENOMEM;
>       }
> diff --git a/ksrc/skins/posix/mutex.c b/ksrc/skins/posix/mutex.c
> index 6ce75e5..cc86852 100644
> --- a/ksrc/skins/posix/mutex.c
> +++ b/ksrc/skins/posix/mutex.c
> @@ -219,10 +219,6 @@ void pse51_mutex_destroy_internal(pse51_mutex_t *mutex,
>       xnlock_put_irqrestore(&nklock, s);
>  
>  #ifdef CONFIG_XENO_FASTSYNCH
> -     /* We call xnheap_free even if the mutex is not pshared; when
> -        this function is called from pse51_mutexq_cleanup, the
> -        sem_heap is destroyed, or not the one to which the fastlock
> -        belongs, xnheap will simply return an error. */

I think this comment is not completely obsolete. It still applies /wrt
shared/non-shared.

>       xnheap_free(&xnsys_ppd_get(mutex->attr.pshared)->sem_heap,
>                   mutex->synchbase.fastlock);
>  #endif /* CONFIG_XENO_FASTSYNCH */
> 
> 

If we can resolve that potential race, this looks like a nice solution.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai-core mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-core

Re: [Xenomai-core] [PULL] native: Fix msendq fastlock leakage

Reply via email to