Philippe Gerum wrote:
> On Fri, 2007-07-20 at 16:20 +0200, Jan Kiszka wrote:
>> OK, let's go through this another time, this time under the motto "get
>> the locking right". As a start (and a help for myself), here comes an
>> overview of the scheme the final version may expose - as long as there
>> are separate locks:
>> gatekeeper_thread / xnshadow_relax:
>>      rpilock, followed by nklock
>>      (while xnshadow_relax puts both under irqsave...)
> The relaxing thread must not be preempted in primary mode before it
> schedules out but after it has been linked to the RPI list, otherwise
> the root thread would benefit from a spurious priority boost. This said,
> in the UP case, we have no lock to contend for anyway, so the point of
> discussing whether we should have the rpilock or not is moot here.
>> xnshadow_unmap:
>>      nklock, then rpilock nested
> This one is the hardest to solve.
>> xnshadow_start:
>>      rpilock, followed by nklock
>> xnshadow_renice:
>>      nklock, then rpilock nested
>> schedule_event:
>>      only rpilock
>> setsched_event:
>>      nklock, followed by rpilock, followed by nklock again
>> And then there is xnshadow_rpi_check which has to be fixed to:
>>      nklock, followed by rpilock (here was our lock-up bug)
> rpilock -> nklock in fact.

Yes, I meant it the other way around: the invocation of
xnpod_renice_root() must be moved out of the nklock-protected section,
which should be trivial, correct?
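
For illustration, here is a minimal user-space sketch of the ordering I
have in mind for xnshadow_rpi_check. It is not the actual nucleus code:
pthread mutexes stand in for the two locks, and rpi_count, root_prio,
renice_root_sketch() and rpi_check_sketch() are hypothetical names made
up for this example. It assumes the renice helper takes nklock itself,
so the caller must enter without nklock held and the only nesting left
is rpilock -> nklock.

```c
#include <pthread.h>

/* User-space stand-ins for the two locks discussed in this thread. */
static pthread_mutex_t rpilock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t nklock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical state: both names are assumptions for this sketch. */
static int rpi_count;   /* threads linked to the RPI list, under rpilock */
static int root_prio;   /* root thread priority, under nklock */

enum { ROOT_PRIO_BASE = 0, ROOT_PRIO_BOOSTED = 10 };

/* Stand-in for the renice call: assumed to grab nklock on its own. */
static void renice_root_sketch(int prio)
{
    pthread_mutex_lock(&nklock);
    root_prio = prio;
    pthread_mutex_unlock(&nklock);
}

/*
 * Must be entered WITHOUT nklock held. The RPI list is inspected under
 * rpilock, and nklock is only taken nested inside it, so the order is
 * always rpilock -> nklock on this path.
 */
static void rpi_check_sketch(void)
{
    pthread_mutex_lock(&rpilock);
    if (rpi_count == 0)
        renice_root_sketch(ROOT_PRIO_BASE); /* drop a stale boost */
    pthread_mutex_unlock(&rpilock);
}
```

With an empty RPI list a boosted root priority drops back to the base
level; with at least one linked thread it is left alone.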

> The last lockup was rather likely due to the
> gatekeeper's dangerous nesting of nklock -> rpilock -> nklock.

This path - as one of three with this ordering - surely triggered the
bug. But given that the other two nestings of this kind cannot be
resolved yet, while our reversely ordered nesting in xnshadow_rpi_check
can, it is clear that the latter is the weak point. So far we only
have a fix for Mathias' test case, which adequately stresses just a
subset of all rpilock paths.
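
To make the remaining paths easier to audit, something like the
following could be used as a debugging aid. It is only a sketch under my
own assumptions, not existing nucleus code: the lock_id enum and
note_nesting() are hypothetical, and the checker merely records which
lock was taken while which other one was held, flagging the first time
both orders have been seen.

```c
#include <stdbool.h>

/* Hypothetical identifiers for the two locks under discussion. */
enum lock_id { NKLOCK, RPILOCK, NR_LOCKS };

/* seen[a][b] is true once some path acquired b while holding a. */
static bool seen[NR_LOCKS][NR_LOCKS];

/*
 * Record that 'inner' was taken while 'outer' was held. Returns false
 * if the opposite nesting was recorded earlier, i.e. two paths nest the
 * same pair of locks in reverse order and can deadlock against each
 * other.
 */
static bool note_nesting(enum lock_id outer, enum lock_id inner)
{
    seen[outer][inner] = true;
    return !seen[inner][outer];
}
```

Calling note_nesting(NKLOCK, RPILOCK) from one path and later
note_nesting(RPILOCK, NKLOCK) from another makes the second call return
false, which is exactly the kind of inversion we are chasing here.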

>> That's a scheme which /should/ be safe. Unfortunately, I see no way to
>> get rid of the remaining nestings.
> There is one, which consists of getting rid of the rpilock entirely. The
> purpose of such lock is to protect the RPI list when fixing the
> situation after a task migration in secondary mode triggered from the
> Linux side. Addressing the latter issue differently may solve the
> problem more elegantly than figuring out how to combine the two locks,
> or hammering the hot path with the nklock. Will look at this.

Even better! Looking forward to it.


Xenomai-core mailing list
