On Tue, 2006-08-01 at 16:45 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> >> Still, reinitializing X while the latency test runs causes
> >> the latter to hang, albeit LOC is still flowing properly and the box
> >> keeps going normally.
> > 
> > This one was due to the nucleus watchdog which triggered right after the
> > graphic mode was fully initialized, due to the huge amount of
> > unpreemptible time spent doing this; this caused the sampling task to be
> > detected as a runaway thread. So the behaviour is ok, albeit a bit
> > frightening at first.
> > 
> That reminds me of the unfortunate characteristics of the 2.6 oom-killer:
> unless you set your time-critical app's oom_adj to -17, you are never
> really safe from being killed accidentally in low-memory scenarios.
> What about introducing some mechanism to protect audited tasks against
> the watchdog? A simple thread flag settable via existing APIs, ignored
> if there is no watchdog compiled in?

There is a fundamental difference between the OOM killer and the Xenomai
watchdog: the latter is merely a debugging tool to prevent the box from
hanging, and you can disable it completely.

The situations reported by the watchdog are pathological ones, which
involve more than 4 seconds of continuous real-time activity while the
Linux kernel is being totally starved of CPU, and in such a case, you
really want someone to pull the brake, regardless of the consequences on
the application (which is basically toast anyway). IOW, if such a
weird situation eventually ends up being considered as "normal" under
certain circumstances, the best approach is simply to disable the
watchdog entirely.
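
To make the behaviour described above concrete, here is a toy model of
such a watchdog. It is only a sketch, not the nucleus code: the class
name, the one-tick-per-second granularity, the linux_ran argument and
the enabled switch are all assumptions made for illustration. The only
things taken from the discussion are the roughly 4 second threshold and
the fact that the watchdog can be disabled entirely.

```python
# Toy model of a starvation watchdog (hypothetical, not Xenomai's code).
# Assumption: tick() is called once per second by a periodic timer.

WATCHDOG_TIMEOUT_TICKS = 4  # assumed: trip after 4 starved ticks (~4 s)


class Watchdog:
    def __init__(self, timeout_ticks=WATCHDOG_TIMEOUT_TICKS, enabled=True):
        self.timeout_ticks = timeout_ticks
        self.enabled = enabled      # models compiling the watchdog out
        self.starved_ticks = 0      # consecutive ticks without Linux running

    def tick(self, linux_ran):
        """Return True if the current real-time thread should be stopped.

        linux_ran tells whether the Linux kernel got any CPU time since
        the previous tick.
        """
        if not self.enabled:
            return False
        if linux_ran:
            # Linux made progress, so nothing is running away.
            self.starved_ticks = 0
            return False
        self.starved_ticks += 1
        if self.starved_ticks >= self.timeout_ticks:
            # Deliberately dumb: no per-thread exemption, just pull the brake.
            self.starved_ticks = 0
            return True
        return False
```

Note that the model has no notion of a "protected" thread: the choice
is between having the check and not having it at all.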

Limiting the runtime quantum allotted to threads through a dedicated
scheduling policy would be a better way to deal with CPU overconsumption
"intelligently", i.e. on a per-thread basis. OTOH, the current watchdog
implementation is aiming at being terminally dumb for the sake of debug


Xenomai-core mailing list
