Philippe Gerum wrote:
> On Tue, 2006-08-01 at 16:45 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>>> Still, reinitializing X while the latency test runs causes
>>>> the latter to hang, albeit LOC is still flowing properly and the box
>>>> keeps going normally.
>>> This one was due to the nucleus watchdog which triggered right after the
>>> graphic mode was fully initialized, due to the huge amount of
>>> unpreemptible time spent doing this; this caused the sampling task to be
>>> detected as a runaway thread. So the behaviour is ok, albeit a bit
>>> frightening at first.
>> That reminds me of the unfortunate characteristics of the 2.6 oom-killer:
>> unless you set your time-critical app's oom_adj to -17, you are never
>> really safe from being killed accidentally in low-memory scenarios.
>> What about introducing some mechanism to protect audited tasks against
>> the watchdog? A simple thread flag settable via existing APIs, ignored
>> if there is no watchdog compiled in?
> There is a fundamental difference between the OOM killer and the Xenomai
> watchdog: the latter is merely a debugging tool to prevent the box from
> hanging, and you can disable it completely.
> The situations reported by the watchdog are pathological ones, which
> involve more than 4 seconds of continuous real-time activity while the
> Linux kernel is being totally starved of CPU, and in such a case, you
> really want someone to pull the brake, regardless of the consequences on
> the application (which basically looks like toast anyway). IOW, if such
> a weird situation eventually ends up being considered as "normal" under
> certain circumstances, the best approach is simply to disable the
> watchdog entirely.
> Limiting the runtime quantum allotted to threads through a dedicated
> scheduling policy would be a better way to deal with CPU overconsumption
> "intelligently", i.e. on a per-thread basis. 

For sure, e.g. round-robin scheduling that includes the root thread and
also works in aperiodic timer mode.
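
To make the round-robin idea a bit more concrete, here is a minimal sketch
of per-thread quantum accounting in which the root thread takes part in the
rotation. All names (rr_thread, rr_tick, NR_THREADS) are made up for
illustration and are not existing nucleus code. It also assumes a periodic
tick; in aperiodic mode the end of a quantum would have to be driven by a
programmed one-shot timer instead, which this sketch does not show.

```c
#include <stddef.h>

/* Hypothetical layout: slot 0 stands for the root (Linux) thread. */
#define NR_THREADS 3

struct rr_thread {
    unsigned int quantum;   /* ticks granted per turn */
    unsigned int left;      /* ticks remaining in the current turn */
};

/*
 * Charge one tick to the thread in slot `curr` and return the slot that
 * should run next. Once a quantum is used up it is refilled and the CPU
 * moves on to the next slot, wrapping around so that the root thread
 * (slot 0) gets its turn like everyone else.
 */
size_t rr_tick(struct rr_thread *t, size_t curr)
{
    if (t[curr].left > 1) {
        t[curr].left--;
        return curr;                    /* quantum not exhausted yet */
    }
    t[curr].left = t[curr].quantum;     /* refill for the next turn */
    return (curr + 1) % NR_THREADS;     /* rotate, root included */
}
```

The point of keeping the root thread in the rotation is that Linux can no
longer be starved indefinitely, whatever a single real-time thread does.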

> OTOH, the current watchdog
> implementation is aiming at being terminally dumb for the sake of debug
> efficiency.

Yes, it's simple and it's a debugging mechanism. Nevertheless, I think
it can be improved without too much effort or cost. I would love to
demonstrate this, but for now I'm afraid this has to remain a (now
filed) idea.
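
For the record, a rough sketch of the kind of per-thread flag I have in
mind. Everything here is hypothetical (THREAD_WD_EXEMPT, wd_thread, wd_tick
and the tick-based timeout are illustration only, not the current watchdog
code); it only assumes the watchdog is polled once per tick and counts how
long the root thread has been starved.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this is existing nucleus code. */
#define WD_TIMEOUT_TICKS 4      /* e.g. one tick per second, 4 s limit */
#define THREAD_WD_EXEMPT 0x1u   /* "audited task" flag set via the thread API */

struct wd_thread {
    unsigned int flags;
    bool is_root;               /* true for the Linux (root) thread */
};

struct watchdog {
    unsigned int starved_ticks; /* consecutive ticks without root running */
};

/*
 * Called once per watchdog tick with the currently running thread.
 * Returns true if that thread should be treated as a runaway.
 */
bool wd_tick(struct watchdog *wd, const struct wd_thread *curr)
{
    if (curr->is_root) {
        wd->starved_ticks = 0;          /* Linux got the CPU: all is well */
        return false;
    }
    if (++wd->starved_ticks < WD_TIMEOUT_TICKS)
        return false;                   /* not starved long enough yet */
    if (curr->flags & THREAD_WD_EXEMPT)
        return false;                   /* audited thread: leave it alone */
    wd->starved_ticks = 0;
    return true;                        /* runaway: caller pulls the brake */
}
```

Without a watchdog compiled in, the flag would simply never be looked at.
Note that in this sketch a non-exempt thread that happens to be running
once the limit has been exceeded is still flagged.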



Xenomai-core mailing list
