Philippe Gerum wrote:
> Jan Kiszka wrote:
>> Hi,
>> bad news, everyone :(. According to the result of some lengthy debug
>> session with a customer and several ad-hoc lttng instrumentations, we
>> have a fatal bug in the nucleus' implementation of the lock stealing
>> algorithm. Consider this scenario:
>> 1. Thread A acquires Mutex X successfully, i.e. it leaves the (in this
>>    case) rt_mutex_acquire service, and its XNWAKEN flag is therefore
>>    cleared.
>> 2. Thread A blocks on some further Mutex Y (in our case it was a
>>    semaphore, but that doesn't matter).
>> 3. Thread B signals the availability of Mutex Y to Thread A, thus it
>>    also sets XNWAKEN in Thread A. But Thread A is not yet scheduled on
>>    its CPU.
>> 4. Thread C tries to acquire Mutex X, finds it assigned to Thread A, but
>>    also notices that the XNWAKEN flag of Thread A is set. Thus it steals
>>    the mutex although Thread A already entered the critical section -
>>    and hell breaks loose...
> See commit #3795, and change log entry from 2008-05-15. Unless I misunderstood
> your description, this bug was fixed in 2.4.4.

Oh, fatally missed that fix.

Anyway, the patch looks a bit unclean to me. Either you are lacking
wwake = NULL in xnpod_suspend_thread, or the whole information encoded
in XNWAKEN can already be covered by wwake directly.
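
To make the second option concrete, here is a toy model of the four steps
quoted above. It is only a sketch and not nucleus code: the toy_* structs and
functions are hypothetical stand-ins, "wakened" plays the role of XNWAKEN, and
the model assumes wwake is reset whenever the thread resumes. It compares a
steal check that looks at the flag alone with one that asks whether the owner
was woken for the very object being contended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the nucleus' xnthread/xnsynch types. */
struct toy_synch;

struct toy_thread {
    bool wakened;             /* stands in for the XNWAKEN bit */
    struct toy_synch *wwake;  /* object whose release woke the thread */
};

struct toy_synch {
    struct toy_thread *owner;
};

struct toy_result {
    bool flag_only;   /* decision of the flag-only check */
    bool per_object;  /* decision of the per-object check */
};

/* Release path: hand the object to a sleeper and mark it as woken. */
static void toy_wakeup(struct toy_synch *s, struct toy_thread *sleeper)
{
    assert(s != NULL && sleeper != NULL);
    s->owner = sleeper;
    sleeper->wakened = true;
    sleeper->wwake = s;
}

/* The woken thread has run again and left the acquire service. */
static void toy_resumed(struct toy_thread *t)
{
    t->wakened = false;
    t->wwake = NULL;
}

/* Flag-only rule: steal whenever the current owner is marked as woken. */
static bool may_steal_flag_only(const struct toy_synch *s)
{
    return s->owner != NULL && s->owner->wakened;
}

/* Per-object rule: steal only if the owner was woken for this object. */
static bool may_steal_per_object(const struct toy_synch *s)
{
    return s->owner != NULL && s->owner->wwake == s;
}

/* Replays steps 1-4 and reports what Thread C would decide for Mutex X. */
static struct toy_result toy_replay(void)
{
    struct toy_thread a = { false, NULL };
    struct toy_synch x = { NULL }, y = { NULL };
    struct toy_result r;

    /* Step 1: A is granted X, runs, and leaves the acquire service. */
    toy_wakeup(&x, &a);
    toy_resumed(&a);

    /* Step 2: A blocks on Y; nothing to record beyond A not running. */

    /* Step 3: B releases Y to A; A is marked woken but has not run yet. */
    toy_wakeup(&y, &a);

    /* Step 4: C contends for X, which A still holds. */
    r.flag_only = may_steal_flag_only(&x);
    r.per_object = may_steal_per_object(&x);
    return r;
}
```

In this model toy_replay() reports that the flag-only check lets C steal X
while A is inside the critical section, whereas the per-object check refuses.
The per-object check never reads the flag, which is the sense in which wwake
alone could carry the information, provided it is reliably cleared.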


Xenomai-core mailing list
