Philippe Gerum wrote: > Jan Kiszka wrote: >> Hi, >> >> the watchdog is currently broken in trunk ("zombie [...] would not >> die..."). In fact, it should also be broken in older versions, but only >> recent thread termination rework made this visible. >> >> When a Xenomai CPU hog is caught by the watchdog, xnpod_delete_thread is >> invoked, causing the current thread to be set in zombie state and >> scheduled out. But as its Linux mate still exist, hell breaks loose once >> Linux tries to get rid of it (the Xenomai zombie is scheduled in again). >> In short: calling xnpod_delete_thread(<self>) for a shadow thread is not >> working, probably never worked cleanly. > > Nak, it is a regression introduced by the scheduler changes in 2.5.x. We > should > detect _any_ shadow thread that schedules out in primary mode then regains > control in secondary mode like we do in the 2.4.x series, not only _relaxing_ > shadow threads. It is perfectly valid to have the Linux task orphaned from > the > deletion of its shadow TCB until Xenomai notices the issue and reaps it; > problem > was that such regression prevented the nucleus to get the memo. > > The following patch should fix the issue: > > Index: include/asm-generic/system.h > =================================================================== > --- include/asm-generic/system.h (revision 4676) > +++ include/asm-generic/system.h (working copy) > @@ -311,6 +311,11 @@ > return !!s; > } > > +static inline int xnarch_root_domain_p(void) > +{ > + return rthal_current_domain == rthal_root_domain; > +} > + > #ifdef CONFIG_SMP > > #define xnlock_get(lock) __xnlock_get(lock XNLOCK_DBG_CONTEXT) > Index: ksrc/nucleus/pod.c > =================================================================== > --- ksrc/nucleus/pod.c (revision 4676) > +++ ksrc/nucleus/pod.c (working copy) > @@ -2137,7 +2137,7 @@ > void __xnpod_schedule(struct xnsched *sched) > { > struct xnthread *prev, *next, *curr = sched->curr; > - int zombie, switched = 0, need_resched, relaxing; > + int zombie, switched = 0, need_resched, shadow; > spl_t s; > > if (xnarch_escalate()) > @@ -2174,9 +2174,9 @@ > next, xnthread_name(next)); > > #ifdef CONFIG_XENO_OPT_PERVASIVE > - relaxing = xnthread_test_state(prev, XNRELAX); > + shadow = xnthread_test_state(prev, XNSHADOW); > #else > - (void)relaxing; > + (void)shadow; > #endif /* CONFIG_XENO_OPT_PERVASIVE */ > > if (xnthread_test_state(next, XNROOT)) { > @@ -2204,12 +2204,18 @@ > > #ifdef CONFIG_XENO_OPT_PERVASIVE > /* > - * Test whether we are relaxing a thread. In such a case, we > - * are here the epilogue of Linux' schedule, and should skip > - * xnpod_schedule epilogue. > + * Test whether we transitioned from primary mode to secondary > + * over a shadow thread. This may happen in two cases: > + * > + * 1) the shadow thread just relaxed. > + * 2) the shadow TCB has just been deleted, in which case > + * we have to reap the mated Linux side as well. > + * > + * In both cases, we are running over the epilogue of Linux's > + * schedule, and should skip our epilogue code. > */ > - if (relaxing) > - goto relax_epilogue; > + if (shadow && xnarch_root_domain_p()) > + goto shadow_epilogue; > #endif /* CONFIG_XENO_OPT_PERVASIVE */ > > switched = 1; > @@ -2252,7 +2258,7 @@ > return; > > #ifdef CONFIG_XENO_OPT_PERVASIVE > - relax_epilogue: > + shadow_epilogue: > { > spl_t ignored;
Finally makes sense and works (but your posting was corrupted). Great. > >> There are basically two approaches to fix it: The first one is to find a >> different way to kill (or only suspend?) > > Suspending the hog won't work, particularly when GDB is involved, because a > pending non-lethal Linux signal may cause the suspended shadow to resume > immediately for processing the signal, therefore defeating the purpose of the > watchdog, leading to an infinite loop. This is why we moved from suspension > to > deletion upon watchdog trigger in 2.3 (2.2 used to suspend only). Yes, that became clear to me in the meantime, too. > > the current shadow thread when >> the watchdog strikes. The second one brought me to another issue: Raise >> SIGKILL for the current thread and make sure that it can be processed by >> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>). Unfortunately, there is >> no way to force a shadow thread into secondary mode to handle pending >> Linux signals unless that thread issues a syscall once in a while. And >> that raises the question if we shouldn't improve this as well while we >> are on it. >> >> Granted, non-broken Xenomai user space threads always issue frequent >> syscalls, otherwise the system would starve (and the watchdog would come >> around). On the other hand, delaying signals till syscall prologues is >> different from plain Linux behaviour... >> >> Comments, ideas? >> > > We probably need a two-stage approach: first record the thread was bumped out > and suspend it from the watchdog handler to give Linux a chance to run again, > then finish the work, killing it for good, next time the root thread is > scheduled in on the same CPU. That confuses me again: The watchdog issue is solved now, no? We are only left with the scenario of breaking out of a user space loop of some Xenomai thread via a Linux signal (which implies SMP - otherwise there is no chance to raise the signal...). Meanwhile I played with some light-weight approach to relax a thread that received a signal (according to do_sigwake_event). Worked, but only once due to a limitation (if not bug) of I-pipe x86: in __ipipe_run_isr, it does not handle the case that a non-root handler may alter the current domain, causing corruptions to the IPIPE_SYNC_FLAG states of the involved domains. Will try to fix this and post my signaling proposal so that this work is not lost. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux _______________________________________________ Xenomai-core mailing list Xenomai-core@gna.org https://mail.gna.org/listinfo/xenomai-core