Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Gilles Chanteperdrix wrote:
>>>> However, I do not have a strong opinion on this, it is just an open
>>>> question. More generally, I would like us to discuss once and for all
>>>> the semantics of the various calls and their effect on the RT_TASK
>>>> duration, instead of changing these semantics every release and risking
>>>> breaking non-broken applications (I mean, the ones which do not segfault).
>>> To pick up this issue again (in order to get my queue flushed):
>>>
>>> We basically have to decide about the question what rt_task_delete
>>> invalidates and what impact this shall have on rt_task_join. It is
>>> already documented that rt_task_delete invalidates (and releases) the
>>> kernel-side resources of a RT_TASK. The question is what shall happen to
>>> the not explicitly mentioned user-side resources (ie. the pthread -
>>> where available).
>>>
>>> Option 1 is to decouple both and keep the user side of a joinable
>>> RT_TASK alive until it is explicitly joined. Option 2 could be to
>>> declare both parts invalid on rt_task_delete. Based on this decision,
>>> the finalization logic of rt_task_delete and rt_task_join then needs to
>>> be adjusted to deliver the right behavior, including proper error codes
>>> instead of sporadic SEGV.
>> Relying on the contents of the RT_TASK structure to know the state of a
>> task is bound to fail: the RT_TASK structure may be copied around, so
>> changing the contents of the RT_TASK structure in rt_task_delete, to use
>> that information later will only work if the same RT_TASK structure is
>> used later. This is fragile.
> 
> That's true but somehow the best we can do to detect errors that remain
> fuzzy otherwise. We neither have a list of all user space RT_TASK
> structs nor any in-kernel object to ask after rt_task_delete or join.
> 
>>> Do we expect applications to rely on this joinability after
>>> rt_task_delete? If yes, we should make it official, document the
>>> descriptor split and the fact that the descriptor cannot be looked up
>>> anymore after deletion but has to be saved beforehand.
>>>
>>> Independently, we need to clarify that cross-process join is not
>>> supported. Trying to do this ATM will result in a SEGV (something I
>>> missed so far).
>> This is a regression. At some point in the past, a NULL pthread_t opaque
>> pointer was used to mean that the thread was living in a different
>> process, and rt_task_delete would skip the pthread_cancel.
>>
> 
> I was talking about rt_task_join on a foreign RT_TASK. And I was wrong,
> it actually works with and without my patch SEGV-free. It just lacks
> documentation.
> 
> But you did not address the core questions.

Xenomai libraries rely on glibc services for the
creation/deletion/joining of threads. It happens that when we misuse
Xenomai services, we end up misusing glibc services, and the glibc
developers chose, in that case, to have a segmentation fault. So, I would
say, the behaviour you do not like comes from glibc, not Xenomai. If
there was a simple way to work around this behaviour, I would say go for
it, but we now realize that working around it correctly requires
overkill solutions. So, no, I will not merge a half-working workaround;
if you want the issue properly fixed, fix it in glibc. But I doubt
it will be easy to convince the glibc developers to add some code to
nicely handle a case which only happens when the libc is misused.

As for the rt_task_delete/rt_task_join question, I think we should
require calling rt_task_join after deleting a thread, because that is the
only way to make sure that all the resources associated with a thread are
freed.
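
To illustrate the point at the glibc level, here is a minimal sketch using
plain pthreads only, not the Xenomai API. It assumes that deletion maps
roughly to pthread_cancel on the user side and that the join maps to
pthread_join; worker() and cancel_then_join() are hypothetical names made
up for this example.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical worker: blocks forever in pause(), a cancellation point. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;)
        pause();
    return NULL;
}

/*
 * Hypothetical helper: cancel a joinable thread, then join it.
 * Returns 0 if the join succeeded and reported cancellation, -1 otherwise.
 */
int cancel_then_join(void)
{
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    /* Assumed analogue of deletion: request termination of the thread. */
    if (pthread_cancel(tid) != 0)
        return -1;

    /* A joinable thread keeps its resources until it has been joined. */
    if (pthread_join(tid, &result) != 0)
        return -1;

    return result == PTHREAD_CANCELED ? 0 : -1;
}
```

Calling cancel_then_join() from main() should return 0 on a system where
the thread gets cancelled as expected. Skipping the pthread_join step
would leave the resources of the joinable thread allocated.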

> 
> Jan
> 


-- 
                                            Gilles.

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
