Bad news, everyone :(. After a lengthy debugging session with a
customer and several ad-hoc LTTng instrumentations, it turns out that
we have a fatal bug in the nucleus' implementation of the lock stealing
algorithm. Consider this scenario:

1. Thread A acquires Mutex X successfully, i.e. it leaves the (in this
   case) rt_mutex_acquire service, and its XNWAKEN flag is therefore
   cleared.

2. Thread A blocks on some further Mutex Y (in our case it was a
   semaphore, but that doesn't matter).

3. Thread B signals the availability of Mutex Y to Thread A, thus it
   also sets XNWAKEN in Thread A. But Thread A is not yet scheduled on
   its CPU.

4. Thread C tries to acquire Mutex X, finds it assigned to Thread A, but
   also notices that the XNWAKEN flag of Thread A is set. Thus it steals
   the mutex although Thread A has already entered the critical section,
   and hell breaks loose...
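
The four steps can be replayed in a small stand-alone model. This is
only a sketch and not the nucleus code: the struct and function names
are made up for illustration, and a plain bool stands in for XNWAKEN in
the thread flags. It assumes that the steal test looks at nothing but
the owner's per-thread flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified model, not the nucleus code: all names are hypothetical. */
struct thread {
    bool woken;              /* stands in for XNWAKEN in the thread flags */
};

struct mutex {
    struct thread *owner;    /* NULL when the mutex is free */
};

/* Hand the mutex to a waiter: it owns the lock but has not run yet. */
void grant(struct mutex *m, struct thread *t)
{
    m->owner = t;
    t->woken = true;
}

/* The owner leaves the acquire service and enters the critical section. */
void acquire_done(struct thread *t)
{
    t->woken = false;
}

/* A wakeup for ANY resource sets the same per-thread flag. */
void wake(struct thread *t)
{
    t->woken = true;
}

/* Flawed steal test: it trusts the owner's per-thread flag. */
bool try_steal(struct mutex *m, struct thread *thief)
{
    if (m->owner != NULL && m->owner->woken) {
        m->owner = thief;
        return true;
    }
    return false;
}

/* Replays steps 1-4; returns true if C stole X while A was inside. */
bool replay_scenario(void)
{
    struct thread a = { false }, c = { false };
    struct mutex x = { NULL };

    grant(&x, &a);
    acquire_done(&a);          /* step 1: A holds X, flag cleared */
    /* step 2: A blocks on Y, nothing to model here */
    wake(&a);                  /* step 3: B signals Y, flag set again */
    return try_steal(&x, &c);  /* step 4: C checks A's flag */
}
```

In this model replay_scenario() returns true: the flag set on behalf of
Y is indistinguishable from a pending grant of X.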

Looks like the XNWAKEN flag is misplaced in the owner's thread flags.
Can we safely move it into the synch object?
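
To make the question concrete, here is a rough sketch of what a
per-object flag could look like. It is a toy model under assumptions,
not a patch: struct synch, its "pending" field and all function names
are hypothetical and do not match the real nucleus structures. The idea
is that a wakeup only marks the object it belongs to, so a signal on Y
cannot make X look stealable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not the nucleus code: all names are hypothetical. */
struct task {
    int id;
};

struct synch {
    struct task *owner;  /* NULL when the object is free */
    bool pending;        /* owner was granted the object but has not resumed */
};

/* Grant the object to a waiter and mark only this object as pending. */
void synch_grant(struct synch *s, struct task *t)
{
    s->owner = t;
    s->pending = true;
}

/* The owner returns from the acquire service for this object. */
void synch_owner_resumed(struct synch *s)
{
    s->pending = false;
}

/* Steal test based on the object's own flag, not on the owner's flags. */
bool synch_try_steal(struct synch *s, struct task *thief)
{
    if (s->owner != NULL && s->pending) {
        s->owner = thief;
        s->pending = false;   /* the thief is running, nothing is pending */
        return true;
    }
    return false;
}

/* Replays steps 1-4 with the per-object flag; returns the steal result. */
bool replay_with_synch_flag(void)
{
    struct task a = { 1 }, c = { 3 };
    struct synch x = { NULL, false }, y = { NULL, false };

    synch_grant(&x, &a);
    synch_owner_resumed(&x);         /* step 1: A is inside X */
    /* step 2: A blocks on Y, nothing to model here */
    synch_grant(&y, &a);             /* step 3: marks Y only */
    return synch_try_steal(&x, &c);  /* step 4: X is not pending */
}
```

Here replay_with_synch_flag() returns false, while a steal attempted
between synch_grant() and synch_owner_resumed() on the same object
still succeeds. Whether this is safe in the real code (SMP locking,
priority inheritance) is exactly the open question.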


Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

Xenomai-core mailing list
