Hi,

the watchdog is currently broken in trunk ("zombie [...] would not die..."). In fact, it should also be broken in older versions, but only the recent thread termination rework made this visible.
When a Xenomai CPU hog is caught by the watchdog, xnpod_delete_thread is invoked, which puts the current thread into zombie state and schedules it out. But as its Linux mate still exists, hell breaks loose once Linux tries to get rid of it (the Xenomai zombie is scheduled in again). In short: calling xnpod_delete_thread(<self>) for a shadow thread does not work, and probably never worked cleanly.

There are basically two approaches to fix it:

1. Find a different way to kill (or only suspend?) the current shadow thread when the watchdog strikes.
2. Raise SIGKILL for the current thread and make sure that Linux can process it (e.g. via xnpod_suspend_thread(<cpu-hog>)).

The second approach brought me to another issue: unfortunately, there is no way to force a shadow thread into secondary mode to handle pending Linux signals unless that thread issues a syscall once in a while. That raises the question of whether we should improve this as well while we are at it. Granted, non-broken Xenomai user space threads always issue frequent syscalls; otherwise the system would starve (and the watchdog would come around). On the other hand, delaying signals until syscall prologues differs from plain Linux behaviour...

Comments, ideas?

Jan
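To illustrate the signal-delivery point, here is a minimal toy model in plain C. It is not Xenomai code and calls no Xenomai or Linux API; toy_thread, toy_syscall and toy_run are hypothetical names made up for this sketch. It assumes only what is described above: a pending signal is examined solely in the syscall prologue, so a thread that never issues a syscall never notices it.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not Xenomai APIs. */
struct toy_thread {
    bool signal_pending;   /* a signal (think SIGKILL) has been raised */
    bool alive;            /* cleared once the signal is handled */
};

/* The only place where a pending signal is looked at: the syscall prologue. */
static void toy_syscall(struct toy_thread *t)
{
    if (t->signal_pending) {
        t->signal_pending = false;
        t->alive = false;
    }
}

/*
 * Run the thread's loop for at most max_iter iterations.
 * The signal is raised at iteration raise_at. The thread issues a
 * syscall every syscall_period iterations; 0 means it never does
 * (the CPU hog case). Returns the number of iterations executed.
 */
static unsigned long toy_run(struct toy_thread *t, unsigned long max_iter,
                             unsigned long syscall_period,
                             unsigned long raise_at)
{
    unsigned long i;

    for (i = 0; i < max_iter && t->alive; i++) {
        if (i == raise_at)
            t->signal_pending = true;
        if (syscall_period && (i + 1) % syscall_period == 0)
            toy_syscall(t);
    }
    return i;
}
```

In this model, a thread with syscall_period = 10 stops at the first syscall after the signal was raised, while a thread with syscall_period = 0 runs through all max_iter iterations with the signal still pending.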
_______________________________________________ Xenomai-core mailing list Xenomaiemail@example.com https://mail.gna.org/listinfo/xenomai-core