Jan Kiszka wrote:
> Hi,
> 
> the watchdog is currently broken in trunk ("zombie [...] would not
> die..."). In fact, it should also be broken in older versions, but only
> recent thread termination rework made this visible.
> 
> When a Xenomai CPU hog is caught by the watchdog, xnpod_delete_thread is
> invoked, causing the current thread to be set in zombie state and
> scheduled out. But as its Linux mate still exists, hell breaks loose once
> Linux tries to get rid of it (the Xenomai zombie is scheduled in again).
> In short: calling xnpod_delete_thread(<self>) for a shadow thread is not
> working, probably never worked cleanly.

Nak, it is a regression introduced by the scheduler changes in 2.5.x. We should
detect _any_ shadow thread that schedules out in primary mode and then regains
control in secondary mode, like we do in the 2.4.x series, not only _relaxing_
shadow threads. It is perfectly valid to have the Linux task orphaned by the
deletion of its shadow TCB until Xenomai notices the issue and reaps it; the
problem was that this regression prevented the nucleus from getting the memo.

The following patch should fix the issue:

Index: include/asm-generic/system.h
===================================================================
--- include/asm-generic/system.h        (revision 4676)
+++ include/asm-generic/system.h        (working copy)
@@ -311,6 +311,11 @@
        return !!s;
  }

+static inline int xnarch_root_domain_p(void)
+{
+       return rthal_current_domain == rthal_root_domain;
+}
+
  #ifdef CONFIG_SMP

  #define xnlock_get(lock)              __xnlock_get(lock  XNLOCK_DBG_CONTEXT)
Index: ksrc/nucleus/pod.c
===================================================================
--- ksrc/nucleus/pod.c  (revision 4676)
+++ ksrc/nucleus/pod.c  (working copy)
@@ -2137,7 +2137,7 @@
  void __xnpod_schedule(struct xnsched *sched)
  {
        struct xnthread *prev, *next, *curr = sched->curr;
-       int zombie, switched = 0, need_resched, relaxing;
+       int zombie, switched = 0, need_resched, shadow;
        spl_t s;

        if (xnarch_escalate())
@@ -2174,9 +2174,9 @@
                   next, xnthread_name(next));

  #ifdef CONFIG_XENO_OPT_PERVASIVE
-       relaxing = xnthread_test_state(prev, XNRELAX);
+       shadow = xnthread_test_state(prev, XNSHADOW);
  #else
-       (void)relaxing;
+       (void)shadow;
  #endif /* CONFIG_XENO_OPT_PERVASIVE */

        if (xnthread_test_state(next, XNROOT)) {
@@ -2204,12 +2204,18 @@

  #ifdef CONFIG_XENO_OPT_PERVASIVE
        /*
-        * Test whether we are relaxing a thread. In such a case, we
-        * are here the epilogue of Linux' schedule, and should skip
-        * xnpod_schedule epilogue.
+        * Test whether we transitioned from primary mode to secondary
+        * over a shadow thread. This may happen in two cases:
+        *
+        * 1) the shadow thread just relaxed.
+        * 2) the shadow TCB has just been deleted, in which case
+        * we have to reap the mated Linux side as well.
+        *
+        * In both cases, we are running over the epilogue of Linux's
+        * schedule, and should skip our epilogue code.
         */
-       if (relaxing)
-               goto relax_epilogue;
+       if (shadow && xnarch_root_domain_p())
+               goto shadow_epilogue;
  #endif /* CONFIG_XENO_OPT_PERVASIVE */

        switched = 1;
@@ -2252,7 +2258,7 @@
        return;

  #ifdef CONFIG_XENO_OPT_PERVASIVE
-      relax_epilogue:
+      shadow_epilogue:
        {
                spl_t ignored;

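To make the change in the test easier to see outside of the nucleus, here is a
minimal standalone sketch of the two predicates. All DEMO_* and demo_* names are
hypothetical stand-ins (they are not nucleus symbols); only the logic mirrors
the hunk above, where the previous thread's state is sampled before the switch
and the current domain is checked after it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state bits standing in for XNRELAX and XNSHADOW. */
enum {
	DEMO_XNRELAX  = 1 << 0,	/* thread is relaxing to secondary mode */
	DEMO_XNSHADOW = 1 << 1,	/* thread is mated to a Linux task */
};

/* Hypothetical stand-in for the domain we resume in after the switch. */
enum demo_domain { DEMO_XENOMAI_DOMAIN, DEMO_ROOT_DOMAIN };

/* Pre-patch test: only a relaxing thread skips the scheduler epilogue. */
static bool old_skips_epilogue(unsigned int prev_state)
{
	return (prev_state & DEMO_XNRELAX) != 0;
}

/* Post-patch test: any shadow that regains control over the root domain. */
static bool new_skips_epilogue(unsigned int prev_state, enum demo_domain now)
{
	return (prev_state & DEMO_XNSHADOW) != 0 && now == DEMO_ROOT_DOMAIN;
}

/* The watchdog case: a shadow deleted in primary mode, never relaxing. */
static void demo_watchdog_case(void)
{
	unsigned int hog = DEMO_XNSHADOW;

	assert(!old_skips_epilogue(hog));		/* missed before */
	assert(new_skips_epilogue(hog, DEMO_ROOT_DOMAIN)); /* caught now */
}
```

A shadow that is switched back in over the Xenomai domain still goes through
the regular epilogue with the new test, which is the intended behaviour.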
> 
> There are basically two approaches to fix it: The first one is to find a
> different way to kill (or only suspend?)

Suspending the hog won't work, particularly when GDB is involved, because a 
pending non-lethal Linux signal may cause the suspended shadow to resume 
immediately for processing the signal, therefore defeating the purpose of the 
watchdog, leading to an infinite loop. This is why we moved from suspension to 
deletion upon watchdog trigger in 2.3 (2.2 used to suspend only).

> the current shadow thread when
> the watchdog strikes. The second one brought me to another issue: Raise
> SIGKILL for the current thread and make sure that it can be processed by
> Linux (e.g. via xnpod_suspend_thread(<cpu-hog>)). Unfortunately, there is
> no way to force a shadow thread into secondary mode to handle pending
> Linux signals unless that thread issues a syscall once in a while. And
> that raises the question if we shouldn't improve this as well while we
> are on it.
> 
> Granted, non-broken Xenomai user space threads always issue frequent
> syscalls, otherwise the system would starve (and the watchdog would come
> around). On the other hand, delaying signals till syscall prologues is
> different from plain Linux behaviour...
> 
> Comments, ideas?
> 

We probably need a two-stage approach: first record that the thread was bumped
out and suspend it from the watchdog handler to give Linux a chance to run
again, then finish the work, killing it for good, the next time the root thread
is scheduled in on the same CPU.
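
As a rough illustration of that two-stage idea, and nothing more than a sketch:
the per-CPU slot, the state enum and all demo_* names below are assumptions
made up for the example, not existing nucleus code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 2	/* hypothetical CPU count for the sketch */

enum demo_state { DEMO_RUNNING, DEMO_SUSPENDED, DEMO_DELETED };

struct demo_thread {
	enum demo_state state;
};

/* Per-CPU record of the thread bumped out by the watchdog, if any. */
static struct demo_thread *demo_bumped[DEMO_NR_CPUS];

/*
 * Stage 1, from the watchdog handler: remember the hog and suspend
 * it, so that Linux gets a chance to run again on this CPU.
 */
static void demo_watchdog_trigger(int cpu, struct demo_thread *hog)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	demo_bumped[cpu] = hog;
	hog->state = DEMO_SUSPENDED;
}

/*
 * Stage 2, when the root thread is scheduled in on this CPU: finish
 * the work by deleting the recorded hog. Returns true if a thread
 * was reaped, false if nothing was pending on this CPU.
 */
static bool demo_root_scheduled_in(int cpu)
{
	struct demo_thread *hog;

	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	hog = demo_bumped[cpu];
	if (hog == NULL)
		return false;

	demo_bumped[cpu] = NULL;
	hog->state = DEMO_DELETED;
	return true;
}
```

The record is kept per CPU so that the kill happens on the CPU where the hog
was bumped out; scheduling the root thread in on another CPU leaves the hog
suspended.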

> Jan
> 
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Xenomai-core mailing list
> Xenomai-core@gna.org
> https://mail.gna.org/listinfo/xenomai-core


-- 
Philippe.