On Wed, 2007-07-25 at 08:54 +0200, M. Koehrer wrote:
> Hi Philippe,
> 
> As I mentioned yesterday, I have applied your (first) patch to Xenomai
> (I did not apply the other additional patches).

The third one would be needed to run the same code against the trunk,
due to some difference in CPU affinity management between v2.3.x and
-devel, but I had no problem running the lockup test for six hours
without it over v2.3.x.

> And yes, my application was running fine without freeze in an
> overnight test. Not only the tiny test application but also the complex
> real-time application that was the root cause for everything.
> That is really a great improvement. Will this fix end up shortly in a 
> maintenance version of Xenomai?
> I would appreciate that as this is a severe bug that should have a fix 
> published as soon as possible.
> 

Yes, it is already merged into the maintenance and -devel branches, and
v2.3.3 will be released shortly, since this is indeed a deadly bug.

> Thanks a lot for the excellent support!
> 

No problem. The lockup test you sent did make a huge difference and
actually allowed me to focus on solving the issue immediately, instead
of trying to find a way to reproduce it first. Thanks for this.

> Regards
> 
> Mathias
> 
> > Hi Philippe,
> > 
> > I have attached this patch to my application. So far it looks really good.
> > However, I leave my test running to be sure that it works.
> > 
> > Regards
> > 
> > Mathias 
> > 
> > 
> > > On Fri, 2007-07-20 at 14:16 +0200, Philippe Gerum wrote: 
> > > > On Fri, 2007-07-20 at 13:54 +0200, M. Koehrer wrote:
> > > > > Hi Philippe,
> > > > > I left my test running for a couple of hours - no freeze so far... 
> > > > > 
> > > > > > However, I have to do some other stuff on this machine, I have to
> > > > > > stop the test now...
> > > > > 
> > > > 
> > > > Ok, thanks for the feedback. I will send an extended patch later today,
> > > > so that you could test it on a longer period when you see fit.
> > > 
> > > It took me a bit longer than expected, but here is a patch which
> > > addresses all the pending issues with RPI, hopefully (applies against
> > > 2.3.1 stock).
> > > 
> > > The good thing about Jan grumbling at me, is that this usually makes me
> > > look at the big picture anew. And the RPI picture was not that nice,
> > > that's a fact.
> > > 
> > > > Besides the locking sequence issue, the ex-aequo #1 problem was that CPU
> > > > migration of Linux tasks causing a RPI boost had some very nasty
> > > > side-effects on RPI management, and would create all sorts of funky
> > > > situations I'm too ashamed to talk about, except under the generic term
> > > > of "horrendous mess".
> > > 
> > > Now, regarding the deadlock issue, suppressing the RPI-specific locking
> > > entirely would have been the best solution, but unfortunately, the
> > > migration scheme makes this out of reach, at least without resorting to
> > > > some hairy and likely unreliable implementation. Therefore, the solution
> > > > I came up with consists of making the RPI lock a per-CPU thing, so that
> > > > most RPI routines are actually grabbing a _local_ lock wrt the current
> > > > CPU, those routines being allowed to hold the nklock as they wish. When
> > > some per-CPU RPI lock is accessed from a remote CPU, it is guaranteed
> > > that _no nklock_ may be held nested. Actually, the remote case only
> > > occurs once, in rpi_clear_remote(), and all its callers are guaranteed
> > > to be nklock-free (a debug assertion even enforces that).
> > > 
> > > For the migration issue, the RPI transitions have been ironed out to
> > > make sure we deal properly with all the subtleties of the Linux load
> > > balancer.
> > > 
> > > Mathias, please let me know if the attached patch improves the situation
> > > on your side.
> > > 
> 
> 
> 
-- 
Philippe.
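
The per-CPU RPI locking rule described in the quoted message of 2007-07-20
can be illustrated with a small user-space sketch. This is not the Xenomai
code: only the names nklock and rpi_clear_remote() come from the thread.
NR_CPUS, rpi_lock[], rpi_boost[], nklock_depth, rpi_init() and
rpi_push_local() are hypothetical, and POSIX mutexes stand in for the
nucleus spinlocks. It only shows the invariant: a local RPI lock may nest
with the nklock, while a remote RPI lock may only be taken when no nklock
is held.

```c
#include <assert.h>
#include <pthread.h>

#define NR_CPUS 4                          /* assumption: fixed CPU count */

static pthread_mutex_t nklock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rpi_lock[NR_CPUS];  /* one RPI lock per CPU */
static int rpi_boost[NR_CPUS];             /* hypothetical per-CPU RPI state */
static _Thread_local int nklock_depth;     /* nklock holds by this thread */

static void rpi_init(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        pthread_mutex_init(&rpi_lock[cpu], NULL);
        rpi_boost[cpu] = 0;
    }
}

static void nklock_get(void)
{
    pthread_mutex_lock(&nklock);
    nklock_depth++;
}

static void nklock_put(void)
{
    nklock_depth--;
    pthread_mutex_unlock(&nklock);
}

/* Local path: the caller works on its own CPU's RPI lock and is free
 * to take the nklock while holding it. */
static void rpi_push_local(int this_cpu)
{
    pthread_mutex_lock(&rpi_lock[this_cpu]);
    nklock_get();
    rpi_boost[this_cpu]++;
    nklock_put();
    pthread_mutex_unlock(&rpi_lock[this_cpu]);
}

/* Remote path: touches another CPU's RPI lock, so the caller must be
 * nklock-free. The assertion mirrors the debug check mentioned in the
 * thread; the nklock is never taken inside this section either. */
static void rpi_clear_remote(int remote_cpu)
{
    assert(nklock_depth == 0);
    pthread_mutex_lock(&rpi_lock[remote_cpu]);
    rpi_boost[remote_cpu] = 0;
    pthread_mutex_unlock(&rpi_lock[remote_cpu]);
}
```

Calling rpi_init(), then rpi_push_local(0) twice, then rpi_clear_remote(0)
exercises both paths. Calling rpi_clear_remote() between nklock_get() and
nklock_put() would trip the assertion, which is the situation the rule is
meant to exclude.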



_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core