Philippe Gerum wrote:
> Philippe Gerum wrote:
>> Jan Kiszka wrote:
>>
>>> [a few interruptions later]
>>>
>>> Jan Kiszka wrote:
>>>
>>>> Rodrigo Rosenfeld Rosas wrote:
>>>>
>>>>> BTW, please, could someone confirm the rt_task_delete(NULL) bug in
>>>>> SVN?
>>>>
>>>>
>>>> Half-confirmed, there is something fishy. I'm struggling with the
>>>> debugger ATM, not sure yet who's wrong ;). It tells me rt_task_delete
>>>> of the skin module is entered with task != NULL...
>>>
>>>
>>>
>>> ...which turns out to be fine, just appears redundant to me when
>>> comparing __rt_task_delete and rt_task_delete for the task=NULL case.
>>>
>>> Anyway, leaving a native task with rt_task_delete(NULL) raises SIGKILL
>>> to the whole process instead of just the task (pthread). This lets your
>>> program terminate unexpectedly - I would say: a bug. And this doesn't
>>> happen with 2.1?
>>>
>>
>> It's a side-effect of a recent bug fix in ksrc/nucleus/shadow.c; now
>> killing
> 
> Er, "deleting" is the right word here. Sending a thread a termination
> signal must kill the entire process as per POSIX, and will continue to
> do so. Calling rt_task_delete() to explicitly delete a single thread
> from within the containing process is another story. The current issue
> is due to the fact that no distinction is made on the caller:
> rt_task_delete() targeting a thread from another process should wipe out
> the entire target process; otherwise, only the local target thread
> should be deleted. It's not clear whether we should still wipe out the
> entire process when the target thread is not the current one, regardless
> of whether that thread belongs to the same process.
> I'm open to suggestions.

Killing other threads within the same process currently only works due
to pthread_cancel. I don't see a portable equivalent for foreign
processes yet either. :-/

I guess the thread termination signal sent by pthread_cancel depends on
glibc internals, specifically its variant (NPTL or LinuxThreads),
doesn't it? Didn't we already have this discussion?

For now I would say the best we can do is to avoid the
rt_task_delete(NULL) side effect in userspace (as I suggested) and live
with the limitation of terminating the whole process when using the
(rather unusual) cross-process rt_task_delete.
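
To make the userspace idea concrete, here is a minimal sketch of the
NULL check. It is not the attached patch and not the actual code in
src/skins/native/task.c: task_delete_sketch(), skin_task_delete_stub()
and demo_self_delete() are hypothetical names, and the stub merely
stands in for the real skin syscall. It only shows that pthread_exit()
ends the calling thread while the rest of the process keeps running.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the skin syscall; NOT the real Xenomai call. */
static int skin_task_delete_stub(void *task)
{
    (void)task;
    return 0;
}

/* Hypothetical userspace wrapper: task == NULL means "delete myself". */
static int task_delete_sketch(void *task)
{
    if (task == NULL)
        pthread_exit(NULL);     /* ends only the calling thread */

    return skin_task_delete_stub(task);
}

static void *worker(void *arg)
{
    int *reached_end = arg;

    task_delete_sketch(NULL);   /* does not return */
    *reached_end = 1;           /* never executed */
    return NULL;
}

/* Returns 1 if the worker exited early and the caller is still alive. */
int demo_self_delete(void)
{
    pthread_t tid;
    int reached_end = 0;
    int rc;

    rc = pthread_create(&tid, NULL, worker, &reached_end);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    return reached_end == 0;
}
```

Calling demo_self_delete() from main() (built with -pthread) should
return 1, i.e. the joining thread survives the worker's self-deletion.
Whether the real skin needs extra cleanup on this path is exactly the
open point that someone has to confirm.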

> 
>> a thread raises a group signal wiping out the entire process.
>> Ok, it's a bit drastic, will fix.
>>
>>> I guess the easiest way to solve this is to catch NULL in userspace and
>>> call pthread_exit() instead of the skin service (the POSIX skin uses
>>> pthread_exit anyway), see attached patch. Someone just has to confirm
>>> that there will be no problem hidden by this approach.
>>
>>
>> Passing NULL needs to work including from user-space; the kernel-space
>> is ok with this, and the API must behave the same way regardless of
>> the execution space. Should fix as needed.
>>
>>>
>>> Jan
>>>
>>>
>>> PS: What's the reason for "if (err == -ESRCH) return 0" in
>>> src/skins/native/task.c, rt_task_delete? Why is that error generated in
>>> the first place if it is zeroed out here?
>>>

<attention: unanswered question above> ;)

Jan

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core