Philippe Gerum wrote: > Jan Kiszka wrote: >> Philippe Gerum wrote: >>> Jan Kiszka wrote: >>>> Hi, >>>> >>>> the watchdog is currently broken in trunk ("zombie [...] would not >>>> die..."). In fact, it should also be broken in older versions, but only >>>> recent thread termination rework made this visible. >>>> >>>> When a Xenomai CPU hog is caught by the watchdog, >>>> xnpod_delete_thread is >>>> invoked, causing the current thread to be set in zombie state and >>>> scheduled out. But as its Linux mate still exist, hell breaks loose >>>> once >>>> Linux tries to get rid of it (the Xenomai zombie is scheduled in >>>> again). >>>> In short: calling xnpod_delete_thread(<self>) for a shadow thread is >>>> not >>>> working, probably never worked cleanly. >>> Nak, it is a regression introduced by the scheduler changes in 2.5.x. >>> We should detect _any_ shadow thread that schedules out in primary >>> mode then regains control in secondary mode like we do in the 2.4.x >>> series, not only _relaxing_ shadow threads. It is perfectly valid to >>> have the Linux task orphaned from the deletion of its shadow TCB >>> until Xenomai notices the issue and reaps it; problem was that such >>> regression prevented the nucleus to get the memo. >>> >>> The following patch should fix the issue: >>> >>> Index: include/asm-generic/system.h >>> =================================================================== >>> --- include/asm-generic/system.h (revision 4676) >>> +++ include/asm-generic/system.h (working copy) >>> @@ -311,6 +311,11 @@ >>> return !!s; >>> } >>> >>> +static inline int xnarch_root_domain_p(void) >>> +{ >>> + return rthal_current_domain == rthal_root_domain; >>> +} >>> + >>> #ifdef CONFIG_SMP >>> >>> #define xnlock_get(lock) __xnlock_get(lock XNLOCK_DBG_CONTEXT) >>> Index: ksrc/nucleus/pod.c >>> =================================================================== >>> --- ksrc/nucleus/pod.c (revision 4676) >>> +++ ksrc/nucleus/pod.c (working copy) >>> @@ -2137,7 +2137,7 @@ >>> void __xnpod_schedule(struct xnsched *sched) >>> { >>> struct xnthread *prev, *next, *curr = sched->curr; >>> - int zombie, switched = 0, need_resched, relaxing; >>> + int zombie, switched = 0, need_resched, shadow; >>> spl_t s; >>> >>> if (xnarch_escalate()) >>> @@ -2174,9 +2174,9 @@ >>> next, xnthread_name(next)); >>> >>> #ifdef CONFIG_XENO_OPT_PERVASIVE >>> - relaxing = xnthread_test_state(prev, XNRELAX); >>> + shadow = xnthread_test_state(prev, XNSHADOW); >>> #else >>> - (void)relaxing; >>> + (void)shadow; >>> #endif /* CONFIG_XENO_OPT_PERVASIVE */ >>> >>> if (xnthread_test_state(next, XNROOT)) { >>> @@ -2204,12 +2204,18 @@ >>> >>> #ifdef CONFIG_XENO_OPT_PERVASIVE >>> /* >>> - * Test whether we are relaxing a thread. In such a case, we >>> - * are here the epilogue of Linux' schedule, and should skip >>> - * xnpod_schedule epilogue. >>> + * Test whether we transitioned from primary mode to secondary >>> + * over a shadow thread. This may happen in two cases: >>> + * >>> + * 1) the shadow thread just relaxed. >>> + * 2) the shadow TCB has just been deleted, in which case >>> + * we have to reap the mated Linux side as well. >>> + * >>> + * In both cases, we are running over the epilogue of Linux's >>> + * schedule, and should skip our epilogue code. >>> */ >>> - if (relaxing) >>> - goto relax_epilogue; >>> + if (shadow && xnarch_root_domain_p()) >>> + goto shadow_epilogue; >>> #endif /* CONFIG_XENO_OPT_PERVASIVE */ >>> >>> switched = 1; >>> @@ -2252,7 +2258,7 @@ >>> return; >>> >>> #ifdef CONFIG_XENO_OPT_PERVASIVE >>> - relax_epilogue: >>> + shadow_epilogue: >>> { >>> spl_t ignored; >> >> Finally makes sense and works (but your posting was corrupted). Great. >> >>>> There are basically two approaches to fix it: The first one is to >>>> find a >>>> different way to kill (or only suspend?) >>> Suspending the hog won't work, particularly when GDB is involved, >>> because a pending non-lethal Linux signal may cause the suspended >>> shadow to resume immediately for processing the signal, therefore >>> defeating the purpose of the watchdog, leading to an infinite loop. >>> This is why we moved from suspension to deletion upon watchdog >>> trigger in 2.3 (2.2 used to suspend only). >> >> Yes, that became clear to me in the meantime, too. >> >>> the current shadow thread when >>>> the watchdog strikes. The second one brought me to another issue: Raise >>>> SIGKILL for the current thread and make sure that it can be >>>> processed by >>>> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, >>>> there is >>>> no way to force a shadow thread into secondary mode to handle pending >>>> Linux signals unless that thread issues a syscall once in a while. And >>>> that raises the question if we shouldn't improve this as well while we >>>> are on it. >>>> >>>> Granted, non-broken Xenomai user space threads always issue frequent >>>> syscalls, otherwise the system would starve (and the watchdog would >>>> come >>>> around). On the other hand, delaying signals till syscall prologues is >>>> different from plain Linux behaviour... >>>> >>>> Comments, ideas? >>>> >>> We probably need a two-stage approach: first record the thread was >>> bumped out and suspend it from the watchdog handler to give Linux a >>> chance to run again, then finish the work, killing it for good, next >>> time the root thread is scheduled in on the same CPU. >> >> That confuses me again: The watchdog issue is solved now, no? We are >> only left with the scenario of breaking out of a user space loop of some >> Xenomai thread via a Linux signal (which implies SMP - otherwise there >> is no chance to raise the signal...). >> > > If you first suspend the hog, then send it a lethal signal, you solve > both issues: first Linux is allowed to run eventually, then your task > won't be able to resume running the faulty code, but solely to process > SIGKILL, which can be made pending early enough because the nucleus > decides when Linux resumes.
I'm not interested in SIGKILL here, rather in SIGSTOP to do debugging. That is currently impossible. > >> Meanwhile I played with some light-weight approach to relax a thread >> that received a signal (according to do_sigwake_event). Worked, but only >> once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr, >> it does not handle the case that a non-root handler may alter the >> current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the >> involved domains. > > It is not a bug, this is wanted. ISR must neither change the current > domain nor migrate CPU; allowing this would open Pandora's box. OK, then please elaborate on this a bit more in the adeos-main thread and explain why __ipipe_sync_stage currently reloads the domain. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux _______________________________________________ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core