On Sun, 2007-10-07 at 18:40 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Sun, 2007-10-07 at 17:27 +0200, Jan Kiszka wrote:
> >> This patch fixes another bug of I-pipe for 2.6.22:
> >>
> >> Due to the introduction of a pgd page cache (quicklist) into that
> >> kernel, __ipipe_pin_range_globally no longer addressed all spots that
> >> need to be updated after vmalloc'ed memory was mapped into the kernel
> >> address range. The result was that, after inserting modular Xenomai, new
> >> applications sometimes received an outdated pgd from the quicklist, and
> >> the next timer IRQ triggered a minor fault over xeno_nucleus. As
> >> handling faults inside non-root domains with the Linux handler doesn't
> >> fly, the box blew up sooner or later.
> >>
> > 
> > Good spot. This said, the page cache is fairly old stuff, introduced a
> > long time ago and already present in 2.6.10, so this means that all
> > patches featuring the on-demand mapping disable support do have the same
> > problem.
> 
> Indeed. But somehow the switch to quicklist or some other pieces of
> 2.6.22 must have changed the preconditions of this issue. I have been
> using Xenomai in modular form for ages on my notebook but only got
> these lockups on 2.6.22.

We've been pretty lucky it seems, or most users end up compiling the
support statically.

> Anyway, we should back-port my patch and also
> spread it to the other archs.

When applicable, yes.

> 
> > 
> >> So I've reworked __ipipe_pin_range_globally, basing it on pgd_list, the
> >> list of all pgd pages (in use or cached) in the system, and folding
> >> __ipipe_pin_range_mapping into it. That makes __ipipe_pin_range_globally
> >> an arch-specific thing from now on.
> >>
> >> So far the quicklist is only biting us on i386, but I would suggest
> >> checking if/how we can apply this new pattern on other archs as well.
> >>
> >> Jan
> >>
> >> PS: UP is now stable with latest Xenomai here, but SMP unfortunately
> >> still misbehaves (I suspect host timer issues).
> >>
> > 
> > I still have a problem with UP here, but this one is due to a Xenomai
> > bug -- the host timer is no longer forwarded when the nucleus timer starts.
> > Does disabling NOHZ & HIRES get things working on your setup?
> > 
> 
> Yes, I have HIRES on, and I guess that's the point: My current
> impression is that there are some bits in Xenomai missing to migrate
> running hires timers from Linux's lapic clockevent device over xntimers. 
> The effect here is that CPU0 continues (probably due to higher timer
> load) while CPU1 stops scheduling timers:
> 
> CPU  SCHEDULED   FIRED       TIMEOUT    INTERVAL    HANDLER      NAME
> 0    2729        2727        31168      -           NULL         [host-timer/0]
> 0    11          10          305103844  1000000000  xnpod_watch  [watchdog]
> 1    11          10          309365472  1000000000  xnpod_watch  [watchdog]
> 

The issue I see seems to be different: I can reproduce the problem
in UP + PIT mode, LAPIC off.

> Jan
> 
-- 
Philippe.



_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
