Jan Kiszka wrote:
> bad news, everyone :(. According to the result of some lengthy debug
> session with a customer and several ad-hoc lttng instrumentations, we
> have a fatal bug in the nucleus' implementation of the lock stealing
> algorithm. Consider this scenario:
> 1. Thread A acquires Mutex X successfully, i.e. it leaves the (in this
> case) rt_mutex_acquire service, and its XNWAKEN flag is therefore
> cleared.
> 2. Thread A blocks on some further Mutex Y (in our case it was a
> semaphore, but that doesn't matter).
> 3. Thread B signals the availability of Mutex Y to Thread A, and thus
> also sets XNWAKEN in Thread A. But Thread A is not yet scheduled on
> its CPU.
> 4. Thread C tries to acquire Mutex X, finds it assigned to Thread A, but
> also notices that the XNWAKEN flag of Thread A is set. Thus it steals
> the mutex although Thread A already entered the critical section -
> and hell breaks loose...
See commit #3795 and the change log entry from 2008-05-15. Unless I
misunderstood your description, this bug was fixed in 2.4.4.
> Looks like the XNWAKEN flag is misplaced in the owner's thread flags.
> Can we safely move it into the synch object?
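
For illustration only, the four steps quoted above can be replayed in a small
C model that compares a per-thread "wakened" bit with a per-object one. This is
a toy sketch, not nucleus code: toy_thread, toy_synch, grant_pending and the
helper functions are hypothetical names made up for the example, and it is not
meant to describe how commit #3795 actually addresses the problem.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the Xenomai nucleus API. */
struct toy_thread {
    bool wakened;              /* stands in for a per-thread XNWAKEN bit */
};

struct toy_synch {
    struct toy_thread *owner;  /* thread the object is currently assigned to */
    bool grant_pending;        /* assumed per-object bit: owner was assigned
                                  this object but has not resumed yet */
};

/* Per-thread variant: stealing is allowed whenever the owner's bit is set,
 * regardless of which object caused it to be set. */
static bool can_steal_thread_flag(const struct toy_synch *s)
{
    return s->owner != NULL && s->owner->wakened;
}

/* Per-object variant: stealing is allowed only while the grant of this
 * particular object is still pending. */
static bool can_steal_object_flag(const struct toy_synch *s)
{
    return s->owner != NULL && s->grant_pending;
}

/* Replays steps 1-4 and reports, for each variant, whether Thread C would
 * be allowed to steal Mutex X. */
static void replay_scenario(bool *steal_thread_flag, bool *steal_object_flag)
{
    struct toy_thread a = { .wakened = false };
    struct toy_synch x = { NULL, false };
    struct toy_synch y = { NULL, false };

    /* Step 1: A acquires X and returns from the acquire service. */
    x.owner = &a;
    x.grant_pending = false;
    a.wakened = false;

    /* Step 2: A blocks on Y (nothing to record in this model). */

    /* Step 3: B hands Y over to A, but A has not been scheduled yet. */
    y.owner = &a;
    y.grant_pending = true;
    a.wakened = true;

    /* Step 4: C finds X assigned to A and checks whether it may steal it. */
    *steal_thread_flag = can_steal_thread_flag(&x);
    *steal_object_flag = can_steal_object_flag(&x);
}
```

In this model the per-thread check lets C steal X even though A is already
inside the critical section, while the per-object check refuses, because only
the grant of Y is still pending.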
Xenomai-core mailing list