Jan Kiszka wrote:
> Gilles Chanteperdrix wrote:
>> Jan Kiszka wrote:
>>> Jan Kiszka wrote:
>>>> It's still unclear what goes on precisely, we are still digging, but the
>>>> test system that can produce this is highly contended.
>>> Short update: Further instrumentation revealed that cr3 differs from
>>> active_mm->pgd while we are looping over that fault, i.e. the kernel
>>> tries to fix up the wrong mm. And that means we have some open race
>>> window between updating cr3 and active_mm somewhere (isn't switch_mm run
>>> in a preemptible manner now?).
>> Maybe the rsp is wrong and leads you to the wrong active_mm?
>>> As a first shot I disabled CONFIG_IPIPE_DELAYED_ATOMICSW, and we are now
>>> checking if it makes a difference. Digging deeper into the code in the
>> As you have found out in the meantime, we do not use unlocked context
>> switches on x86.
> The last question I asked myself (but couldn't answer yet due to other
> activity) was: Where are the local_irq_disable/enable_hw around
> switch_mm for its Linux callers?
Ha, that's the point: only activate_mm is protected, but there are more
switch_mm callers in 2.6.29, and maybe in other kernels, too!
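
To illustrate the suspected window, here is a toy user-space model in C.
It is only a sketch, not kernel code: struct cpu, irq_point(), irqs_off()
and friends are made-up names, and the order in which cr3 and active_mm
get updated is an assumption (the window exists in either order). It just
shows that a handler running between the two updates sees cr3 disagree
with active_mm->pgd, and that masking hardware IRQs around both updates
closes the window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are NOT the real kernel structures. */
struct mm { int pgd; };

struct cpu {
    int cr3;               /* models the hardware page-table base */
    struct mm *active_mm;  /* models the kernel's bookkeeping */
    bool hw_irqs_off;      /* models the "hardware IRQs masked" state */
    bool irq_pending;      /* an interrupt that arrived while masked */
    int mismatches;        /* times a handler saw cr3 != active_mm->pgd */
};

/* Hypothetical handler: what a fault or preemption would observe. */
static void handler(struct cpu *c)
{
    if (c->cr3 != c->active_mm->pgd)
        c->mismatches++;
}

/* An interrupt arrives here; it is deferred if hardware IRQs are masked. */
static void irq_point(struct cpu *c)
{
    if (c->hw_irqs_off)
        c->irq_pending = true;
    else
        handler(c);
}

static void irqs_off(struct cpu *c)
{
    c->hw_irqs_off = true;
}

/* Unmask and replay a deferred interrupt, if any. */
static void irqs_on(struct cpu *c)
{
    c->hw_irqs_off = false;
    if (c->irq_pending) {
        c->irq_pending = false;
        handler(c);
    }
}

/* Unprotected caller: the two updates can be split by an interrupt. */
static void switch_unprotected(struct cpu *c, struct mm *next)
{
    assert(next != NULL);
    c->cr3 = next->pgd;    /* assumed order: cr3 first ... */
    irq_point(c);          /* race window */
    c->active_mm = next;   /* ... bookkeeping second */
}

/* Protected caller: both updates happen with hardware IRQs masked. */
static void switch_protected(struct cpu *c, struct mm *next)
{
    assert(next != NULL);
    irqs_off(c);
    c->cr3 = next->pgd;
    irq_point(c);          /* deferred until irqs_on() */
    c->active_mm = next;
    irqs_on(c);
}

/* Run one switch from mm a to mm b; return the inconsistencies observed. */
int count_mismatches(bool protect)
{
    struct mm a = { .pgd = 1 }, b = { .pgd = 2 };
    struct cpu c = { .cr3 = a.pgd, .active_mm = &a };

    if (protect)
        switch_protected(&c, &b);
    else
        switch_unprotected(&c, &b);
    return c.mismatches;
}
```

In this model count_mismatches(false) reports one inconsistent
observation and count_mismatches(true) reports none, which is the
behaviour we would want from every switch_mm caller, not just
activate_mm.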
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux
Xenomai-core mailing list