Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>> SIGKILL for the current thread and make sure that it can be processed by
>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>)). Unfortunately, there is
>>>> no way to force a shadow thread into secondary mode to handle pending
>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>> that raises the question if we shouldn't improve this as well while we
>>>> are on it.
>>>>
>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>> syscalls, otherwise the system would starve (and the watchdog would come
>>>> around). On the other hand, delaying signals till syscall prologues is
>>>> different from plain Linux behaviour...
>>>>
>>>> Comments, ideas?
>>>>
>>> We probably need a two-stage approach: first record the thread was bumped
>>> out and suspend it from the watchdog handler to give Linux a chance to run
>>> again, then finish the work, killing it for good, next time the root thread
>>> is scheduled in on the same CPU.
>> That confuses me again: The watchdog issue is solved now, no? We are
>> only left with the scenario of breaking out of a user space loop of some
>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>> is no chance to raise the signal...).
>>
>> Meanwhile I played with some light-weight approach to relax a thread
>> that received a signal (according to do_sigwake_event). Worked, but only
>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>> it does not handle the case that a non-root handler may alter the
>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>> involved domains. Will try to fix this and post my signaling proposal so
>> that this work is not lost.
> 
> If we go that way, I would vote for a SIGSEGV instead of the SIGKILL.
> This would allow installing a handler to dump the backtrace, or even let
> gdb stop at the point of the infinite loop. A SIGSEGV handler is not
> expected to recover anyway (well, except when implementing COW in
> user space, but that does not fit well with real-time threads).

Yes, I also thought about such a mechanism to allow gdb to catch the
problem. But as a first step I do not plan to convert the watchdog kill
mechanism.
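
For illustration, here is a minimal sketch of the application-side handler
Gilles describes, assuming the watchdog were changed to raise SIGSEGV (it is
not today). It uses only plain Linux/glibc calls (sigaction, backtrace,
backtrace_symbols_fd), no Xenomai API. The names install_backtrace_handler
and segv_backtrace_handler and the message text are hypothetical.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Hypothetical handler: dump the call stack to stderr, then let the
 * default SIGSEGV action terminate the process. */
static void segv_backtrace_handler(int sig)
{
    static const char msg[] = "SIGSEGV received, backtrace follows:\n";
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
        /* nothing sensible to do about a failed write here */
    }
    /* Unlike backtrace_symbols(), this variant does not call malloc(). */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);

    /* SA_RESETHAND already restored the default action; the re-raised
     * signal is delivered once this handler returns and ends the process. */
    raise(sig);
}

/* Hypothetical helper: returns 0 on success, -1 on failure (errno set). */
int install_backtrace_handler(void)
{
    struct sigaction sa;
    void *warmup[1];

    /* The first backtrace() call may load libgcc and allocate memory,
     * so do it once here rather than inside the signal handler. */
    backtrace(warmup, 1);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = segv_backtrace_handler;
    sa.sa_flags = SA_RESETHAND;   /* one-shot: no attempt to recover */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

An application would call install_backtrace_handler() once during
initialization, before its real-time threads start. Linking with -rdynamic
makes the dumped frames show function names instead of bare addresses.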

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core