Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> Jan Kiszka wrote:
>>>> the watchdog strikes. The second one brought me to another issue: Raise
>>>> SIGKILL for the current thread and make sure that it can be processed by
>>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is
>>>> no way to force a shadow thread into secondary mode to handle pending
>>>> Linux signals unless that thread issues a syscall once in a while. And
>>>> that raises the question if we shouldn't improve this as well while we
>>>> are on it.
>>>>
>>>> Granted, non-broken Xenomai user space threads always issue frequent
>>>> syscalls, otherwise the system would starve (and the watchdog would come
>>>> around). On the other hand, delaying signals till syscall prologues is
>>>> different from plain Linux behaviour...
>>>>
>>>> Comments, ideas?
>>>>
>>> We probably need a two-stage approach: first record the thread was bumped
>>> out and suspend it from the watchdog handler to give Linux a chance to run
>>> again, then finish the work, killing it for good, next time the root thread
>>> is scheduled in on the same CPU.
>> That confuses me again: The watchdog issue is solved now, no? We are
>> only left with the scenario of breaking out of a user space loop of some
>> Xenomai thread via a Linux signal (which implies SMP - otherwise there
>> is no chance to raise the signal...).
>>
>> Meanwhile I played with some light-weight approach to relax a thread
>> that received a signal (according to do_sigwake_event). Worked, but only
>> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr,
>> it does not handle the case that a non-root handler may alter the
>> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the
>> involved domains. Will try to fix this and post my signaling proposal so
>> that this work is not lost.
>
> If we go that way, I would vote for a SIGSEGV instead of the SIGKILL.
> This would allow to install a handler to dump the backtrace, or even gdb
> to be stopped at the point of the infinite loop, and a SIGSEGV handler
> is not expected to recover (well, except in cases of implementation of
> COW in user-space, but that does not fit well with real-time threads).
Yeah, I also thought about such a mechanism to allow gdb to catch the problem.
But as a first step I do not plan to convert the watchdog kill mechanism.

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux