Gilles Chanteperdrix wrote:
> Jan Kiszka wrote:
>> Jan Kiszka wrote:
>>> It's still unclear what goes on precisely, we are still digging, but the
>>> test system that can produce this is highly contended.
>> Short update: Further instrumentation revealed that cr3 differs from
>> active_mm->pgd while we are looping over that fault, ie. the kernel
>> tries to fixup the wrong mm. And that means we have some open race
>> window between updating cr3 and active_mm somewhere (isn't switch_mm run
>> in a preemptible manner now?).
>
> Maybe the rsp is wrong and leads you to the wrong active_mm ?
>
>> As a first shot I disabled CONFIG_IPIPE_DELAYED_ATOMICSW, and we are now
>> checking if it makes a difference. Digging deeper into the code in the
>> meanwhile...
>
> As you have found out in the mean time, we do not use unlocked context
> switches on x86.
>
Yes. The last question I asked myself (but couldn't answer yet due to
other activity) was: Where are the local_irq_disable/enable_hw around
switch_mm for its Linux callers?

Jan

-- 
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core