On 06/23/2011 08:24 PM, Philippe Gerum wrote:
> On Thu, 2011-06-23 at 20:13 +0200, Philippe Gerum wrote:
>> On Thu, 2011-06-23 at 19:32 +0200, Gilles Chanteperdrix wrote:
>>> On 06/23/2011 01:15 PM, Jan Kiszka wrote:
>>>> On 2011-06-23 13:11, Gilles Chanteperdrix wrote:
>>>>> On 06/23/2011 11:37 AM, Jan Kiszka wrote:
>>>>>> On 2011-06-20 19:07, Jan Kiszka wrote:
>>>>>>> On 2011-06-19 15:00, Gilles Chanteperdrix wrote:
>>>>>>>> On 06/19/2011 01:17 PM, Gilles Chanteperdrix wrote:
>>>>>>>>> On 06/19/2011 12:14 PM, Gilles Chanteperdrix wrote:
>>>>>>>>>> I am working on this ppd cleanup issue again, and I am asking for
>>>>>>>>>> help to find a fix in -head for all cases where the sys_ppd is
>>>>>>>>>> needed during some cleanup.
>>>>>>>>>>
>>>>>>>>>> The problem is that when the ppd cleanup is invoked:
>>>>>>>>>> - we have no guarantee that current is a thread from the Xenomai
>>>>>>>>>> application;
>>>>>>>>>> - if it is, current->mm is NULL.
>>>>>>>>>>
>>>>>>>>>> So, associating the sys_ppd with either current or current->mm does
>>>>>>>>>> not work. What we could do is pass the sys_ppd to all the other
>>>>>>>>>> ppds' cleanup handlers; this would fix cases such as freeing the
>>>>>>>>>> mutexes' fastlocks, but it does not help when the sys_ppd is needed
>>>>>>>>>> during a thread deletion hook.
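>>>>>>>>>>
>>>>>>>>>> For reference, the lookup is currently keyed on current->mm alone,
>>>>>>>>>> roughly like this (simplified sketch of xnshadow_ppd_get in
>>>>>>>>>> shadow.c):
>>>>>>>>>>
>>>>>>>>>> xnshadow_ppd_t *xnshadow_ppd_get(unsigned muxid)
>>>>>>>>>> {
>>>>>>>>>> 	if (xnpod_userspace_p())
>>>>>>>>>> 		/* useless once current->mm is NULL, or when current
>>>>>>>>>> 		   is not a thread of the Xenomai application */
>>>>>>>>>> 		return ppd_lookup(muxid, current->mm);
>>>>>>>>>>
>>>>>>>>>> 	return NULL;
>>>>>>>>>> }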
>>>>>>>>>>
>>>>>>>>>> I would like to find a solution where simply calling xnsys_ppd_get()
>>>>>>>>>> works, and where we do not need a separate xnsys_ppd_get for each
>>>>>>>>>> context (for instance
>>>>>>>>>> xnsys_ppd_get_by_mm/xnsys_ppd_get_by_task_struct), because that
>>>>>>>>>> would be too error-prone.
>>>>>>>>>>
>>>>>>>>>> Any idea anyone?
>>>>>>>>>
>>>>>>>>> The best I could come up with: use a ptd to store the mm currently 
>>>>>>>>> being cleaned up, so that xnshadow_ppd_get continues to work, even
>>>>>>>>> in the middle of a cleanup.
>>>>>>>>
>>>>>>>> In order to also get xnshadow_ppd_get to work in task deletion hooks 
>>>>>>>> (which is needed to avoid the issue at the origin of this thread), we 
>>>>>>>> also need to set this ptd upon shadow mapping, so it is still there 
>>>>>>>> when reaching the task deletion hook (where current->mm may be NULL). 
>>>>>>>> Hence the patch:
>>>>>>>>
>>>>>>>> diff --git a/ksrc/nucleus/shadow.c b/ksrc/nucleus/shadow.c
>>>>>>>> index b243600..6bc4210 100644
>>>>>>>> --- a/ksrc/nucleus/shadow.c
>>>>>>>> +++ b/ksrc/nucleus/shadow.c
>>>>>>>> @@ -65,6 +65,11 @@ int nkthrptd;
>>>>>>>>  EXPORT_SYMBOL_GPL(nkthrptd);
>>>>>>>>  int nkerrptd;
>>>>>>>>  EXPORT_SYMBOL_GPL(nkerrptd);
>>>>>>>> +int nkmmptd;
>>>>>>>> +EXPORT_SYMBOL_GPL(nkmmptd);
>>>>>>>> +
>>>>>>>> +#define xnshadow_mmptd(t) ((t)->ptd[nkmmptd])
>>>>>>>> +#define xnshadow_mm(t) ((struct mm_struct *)xnshadow_mmptd(t))
>>>>>>>
>>>>>>> xnshadow_mm() can now return a no-longer-existing mm, so no user of
>>>>>>> xnshadow_mm() should ever dereference that pointer. Thus we had better
>>>>>>> change all its users to treat the return value as an opaque void
>>>>>>> pointer.
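>>>>>>>
>>>>>>> E.g., an untested sketch, assuming ppd_lookup() only uses the mm
>>>>>>> pointer value as a lookup key and never dereferences it:
>>>>>>>
>>>>>>> #define xnshadow_mmptd(t) ((t)->ptd[nkmmptd])
>>>>>>> #define xnshadow_mm(t)    ((void *)xnshadow_mmptd(t))
>>>>>>>
>>>>>>> xnshadow_ppd_t *xnshadow_ppd_get(unsigned muxid)
>>>>>>> {
>>>>>>> 	if (xnpod_userspace_p())
>>>>>>> 		/* possibly a stale key, but only its value is compared */
>>>>>>> 		return ppd_lookup(muxid, xnshadow_mm(current) ?: current->mm);
>>>>>>>
>>>>>>> 	return NULL;
>>>>>>> }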
>>>>>>>
>>>>>>>>  
>>>>>>>>  struct xnskin_slot {
>>>>>>>>        struct xnskin_props *props;
>>>>>>>> @@ -1304,6 +1309,8 @@ int xnshadow_map(xnthread_t *thread, xncompletion_t __user *u_completion,
>>>>>>>>         * friends.
>>>>>>>>         */
>>>>>>>>        xnshadow_thrptd(current) = thread;
>>>>>>>> +      xnshadow_mmptd(current) = current->mm;
>>>>>>>> +
>>>>>>>>        rthal_enable_notifier(current);
>>>>>>>>  
>>>>>>>>        if (xnthread_base_priority(thread) == 0 &&
>>>>>>>> @@ -2759,7 +2766,15 @@ static void detach_ppd(xnshadow_ppd_t * ppd)
>>>>>>>>  
>>>>>>>>  static inline void do_cleanup_event(struct mm_struct *mm)
>>>>>>>>  {
>>>>>>>> +      struct task_struct *p = current;
>>>>>>>> +      struct mm_struct *old;
>>>>>>>> +
>>>>>>>> +      old = xnshadow_mm(p);
>>>>>>>> +      xnshadow_mmptd(p) = mm;
>>>>>>>> +
>>>>>>>>        ppd_remove_mm(mm, &detach_ppd);
>>>>>>>> +
>>>>>>>> +      xnshadow_mmptd(p) = old;
>>>>>>>
>>>>>>> I don't have the full picture yet, but that feels racy: If the context
>>>>>>> over which we clean up that foreign mm is also using xnshadow_mmptd,
>>>>>>> other threads in that process may dislike this temporary change.
>>>>>>>
>>>>>>>>  }
>>>>>>>>  
>>>>>>>>  RTHAL_DECLARE_CLEANUP_EVENT(cleanup_event);
>>>>>>>> @@ -2925,7 +2940,7 @@ EXPORT_SYMBOL_GPL(xnshadow_unregister_interface);
>>>>>>>>  xnshadow_ppd_t *xnshadow_ppd_get(unsigned muxid)
>>>>>>>>  {
>>>>>>>>        if (xnpod_userspace_p())
>>>>>>>> -              return ppd_lookup(muxid, current->mm);
>>>>>>>> +              return ppd_lookup(muxid, xnshadow_mm(current) ?: current->mm);
>>>>>>>>  
>>>>>>>>        return NULL;
>>>>>>>>  }
>>>>>>>> @@ -2960,8 +2975,9 @@ int xnshadow_mount(void)
>>>>>>>>        sema_init(&completion_mutex, 1);
>>>>>>>>        nkthrptd = rthal_alloc_ptdkey();
>>>>>>>>        nkerrptd = rthal_alloc_ptdkey();
>>>>>>>> +      nkmmptd = rthal_alloc_ptdkey();
>>>>>>>>  
>>>>>>>> -      if (nkthrptd < 0 || nkerrptd < 0) {
>>>>>>>> +      if (nkthrptd < 0 || nkerrptd < 0 || nkmmptd < 0) {
>>>>>>>>                printk(KERN_ERR "Xenomai: cannot allocate PTD slots\n");
>>>>>>>>                return -ENOMEM;
>>>>>>>>        }
>>>>>>>> diff --git a/ksrc/skins/posix/mutex.c b/ksrc/skins/posix/mutex.c
>>>>>>>> index 6ce75e5..cc86852 100644
>>>>>>>> --- a/ksrc/skins/posix/mutex.c
>>>>>>>> +++ b/ksrc/skins/posix/mutex.c
>>>>>>>> @@ -219,10 +219,6 @@ void pse51_mutex_destroy_internal(pse51_mutex_t *mutex,
>>>>>>>>        xnlock_put_irqrestore(&nklock, s);
>>>>>>>>  
>>>>>>>>  #ifdef CONFIG_XENO_FASTSYNCH
>>>>>>>> -      /* We call xnheap_free even if the mutex is not pshared; when
>>>>>>>> -         this function is called from pse51_mutexq_cleanup, the
>>>>>>>> -         sem_heap is destroyed, or not the one to which the fastlock
>>>>>>>> -         belongs, xnheap will simply return an error. */
>>>>>>>
>>>>>>> I think this comment is not completely obsolete. It still applies with
>>>>>>> respect to shared vs. non-shared.
>>>>>>>
>>>>>>>>        xnheap_free(&xnsys_ppd_get(mutex->attr.pshared)->sem_heap,
>>>>>>>>                    mutex->synchbase.fastlock);
>>>>>>>>  #endif /* CONFIG_XENO_FASTSYNCH */
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> If we can resolve that potential race, this looks like a nice solution.
>>>>>>
>>>>>> We still have to address that ordering issue I almost forgot:
>>>>>> do_cleanup_event runs before do_task_exit_event when terminating the
>>>>>> last task. The former destroys the sem heap, the latter fires the delete
>>>>>> hook which then tries to free msendq.fastlock to an invalid heap.
>>>>>>
>>>>>> Should be fixable by setting sem_heap to NULL in the ppd on destroy and
>>>>>> skipping the fastlock release in __task_delete_hook if the heap pointer
>>>>>> is found to be NULL.
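>>>>>>
>>>>>> Something along these lines (untested sketch; the sem_heap_destroyed
>>>>>> flag is made up, the real ppd layout may allow just NULLing a heap
>>>>>> pointer instead, and the exact arguments are approximate):
>>>>>>
>>>>>> /* in do_cleanup_event(), once the per-process sem heap is gone: */
>>>>>> sys_ppd->sem_heap_destroyed = 1;	/* hypothetical flag */
>>>>>>
>>>>>> /* in the native skin's __task_delete_hook(): */
>>>>>> #ifdef CONFIG_XENO_FASTSYNCH
>>>>>> 	if (!xnsys_ppd_get(0)->sem_heap_destroyed)
>>>>>> 		xnheap_free(&xnsys_ppd_get(0)->sem_heap,
>>>>>> 			    task->msendq.fastlock);
>>>>>> #endif /* CONFIG_XENO_FASTSYNCH */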
>>>>>
>>>>> I do not think this can be a problem, as do_cleanup_event will also
>>>>> destroy the threads.
>>>>
>>>> At least native tasks (but I bet that's true for all skins) aren't part
>>>> of the XNSHADOW_CLIENT_DETACH cleanup procedure.
>>>
>>> Then it is indeed a problem, as we will have another ppd issue in the
>>> task deletion callbacks. I did not notice because the posix skin does
>>> clean up its threads, and I simply assumed that every skin did the same.
>>>
>>> What is the reason for doing this? Are the thread deletion hooks
>>> supposed to work only when called from the context of the dying thread?
>>>
>>
>> When a shadow thread self-deletes, yes. This is a prerequisite for
>> xnshadow_unmap to work on the proper context.
> 
> Read: a shadow _always_ self-deletes; this is a prerequisite too. So the
> first rule, "deletion hooks shall run over the deleted thread", always
> applies to it.

It works by chance for the posix skin then: the main thread's shadow is
unmapped in the cleanup callback, and that only works because the cleanup
event happens to be executed, most of the time, over the main thread's
context.

For the other threads, the xnshadow_unmap function checks that the
unmapped shadow is current, but only in its second half, which is never
executed in the case of the taskexit event.

-- 
                                                                Gilles.
