On Thu, 2007-07-19 at 17:35 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Thu, 2007-07-19 at 14:40 +0200, Jan Kiszka wrote:
> >> Philippe Gerum wrote:
> >>>> And when looking at the holders of rpilock, I think one issue could be
> >>>> that we hold that lock while calling into xnpod_renice_root(), i.e.
> >>>> doing a potential context switch. Was this checked to be safe?
> >>> xnpod_renice_root() does not reschedule immediately on purpose; we would
> >>> never have been able to run any SMP config more than a couple of seconds
> >>> otherwise. (See the NOSWITCH bit).
> >> OK, then it's not the cause.
> >>>> Furthermore, that code path reveals that we take nklock nested into
> >>>> rpilock. I haven't found a spot for the other way around (and I hope
> >>>> there is none)
> >>> xnshadow_start().
> >> Nope, that one is not holding nklock. But I found an offender...
> > Gasp. xnshadow_renice() kills us too.
> Looks like we are approaching mainline "qualities" here - but they have
> at least lockdep (and still face nasty races regularly).
We only have a 2-level locking depth at most, which barely qualifies for
being compared to the situation with mainline. Most often, the more
radical the solution, the less relevant it is: simple nesting on very
few levels is not bad, a buggy nesting sequence is.
> As long as you can't avoid nesting or the inner lock only protects
> really, really trivial code (list manipulation etc.), I would say there
> is one lock too much... Did I mention that I consider nesting to be
> evil? :-> Besides correctness, there is also an increasing worst-case
> behaviour issue with each additional nesting level.
In this case, we do not want the RPI manipulation to affect the
worst-case of all other threads by holding the nklock. This is
fundamentally a migration-related issue, which is a situation that must
not impact all other contexts relying on the nklock. Given this, you
need to protect the RPI list and prevent the scheduler data from being
altered at the same time; there is no cheap trick to avoid this.
We need to keep the rpilock, otherwise we would have significant
latency penalties, especially when domain migrations are frequent, and
yes, we do need RPI, otherwise the sequence for emulated RTOS services
would be plain wrong (e.g. task creation).
Ok, the rpilock is local, the nesting level is bearable, let's focus on
putting this thingy straight.
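
To make the ordering rule discussed above concrete (nklock may nest inside
rpilock, never the other way around), here is a minimal user-space sketch
of a rank check in the spirit of lockdep. It is not Xenomai code: the
struct ranked_lock type, the rank values, and both helper functions are
hypothetical, and pthread mutexes merely stand in for the real locks.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical ranked lock: a lower rank must be taken before a higher one. */
struct ranked_lock {
    pthread_mutex_t mutex;
    int rank;       /* position in the allowed nesting order */
    int prev_rank;  /* rank the owner held before taking this lock */
};

/* Stand-ins for the two locks from the thread; ranks are an assumption. */
static struct ranked_lock rpilock = { .mutex = PTHREAD_MUTEX_INITIALIZER, .rank = 1 };
static struct ranked_lock nklock  = { .mutex = PTHREAD_MUTEX_INITIALIZER, .rank = 2 };

/* Highest rank currently held by the calling thread (0 means none). */
static _Thread_local int held_rank;

/* Returns 0 on success, -1 if taking the lock would invert the order. */
static int ranked_lock_acquire(struct ranked_lock *l)
{
    if (l->rank <= held_rank)
        return -1;              /* e.g. rpilock requested while holding nklock */
    pthread_mutex_lock(&l->mutex);
    l->prev_rank = held_rank;   /* written only while the mutex is held */
    held_rank = l->rank;
    return 0;
}

/* Locks must be released in reverse order of acquisition. */
static void ranked_lock_release(struct ranked_lock *l)
{
    assert(held_rank == l->rank);
    held_rank = l->prev_rank;
    pthread_mutex_unlock(&l->mutex);
}
```

With such a check, a path that takes rpilock and then nklock passes,
while a path that already holds nklock and asks for rpilock is refused
instead of risking an AB-BA deadlock on SMP.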
Xenomai-core mailing list