Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> It's still unclear what goes on precisely, we are still digging, but the
>>> test system that can produce this is highly contended.
>> Short update: Further instrumentation revealed that cr3 differs from
>> active_mm->pgd while we are looping over that fault, i.e. the kernel
>> tries to fix up the wrong mm. And that means we have some open race
>> window between updating cr3 and active_mm somewhere (isn't switch_mm run
>> in a preemptible manner now?).
> 
> Maybe the rsp is wrong and leads you to the wrong active_mm ?
> 
>> As a first shot I disabled CONFIG_IPIPE_DELAYED_ATOMICSW, and we are now
>> checking if it makes a difference. Digging deeper into the code in the
>> meanwhile...
> 
> As you have found out in the meantime, we do not use unlocked context
> switches on x86.
> 

Yes.

The last question I asked myself (but couldn't answer yet due to other
activity) was: Where are the local_irq_disable_hw/local_irq_enable_hw
calls around switch_mm for its Linux callers?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
