Philippe Gerum wrote:
Jan Kiszka wrote:
Jan Kiszka wrote:
I'm afraid this one is serious: let the attached migration stress test
run on likely any Xenomai since 2.0, preferably with
CONFIG_XENO_OPT_DEBUG on. Will give a nice crash sooner or later (I'm
trying to set up a serial console now).
Confirmed here. My test box went through some nifty triple salto out of
the window running this frag for 2mn or so. Actually, the semop
handshake is not even needed to cause the crash. At first sight, it
looks like a migration issue taking place during the critical phase when
a shadow thread switches back to Linux to terminate.
As it took some time to persuade my box to not just reboot but to give a
message, I'm posting here the kernel dump of the P-III running
Xenomai: starting native API services.
ce649fb4 ce648000 00000b17 00000202 c0139246 cdf2819c cdf28070 0b12d310
00000037 ce648000 00000000 c02f0700 00009a28 00000000 b7e94a70
00000000 ce648000 c0102fcb b7e94a70 bfed63dc b7faf4b0 bfed63c8
Xenomai: fatal: blocked thread migration rescheduled?!
(status=0x300010, sig=0, prev=watchdog/0)
This babe is awaken by Linux while Xeno sees it in a dormant state,
likely after it has terminated. No wonder why things are going wild
after that... Ok, job queued. Thanks.
CPU PID PRI TIMEOUT STAT NAME
0 0 0 0 00500080 ROOT
0 22175 1 0 00300110 migration
cea05ee4 d0842c62 cdcb0000 cea6d030 c02f0700 c035cbec c02f0700 00000286
c0139246 00000022 c02f0700 cdf28070 cdf28070 00000022 00000001
cea6d030 cdf28070 cea6d158 cea05f78 c02b26c0 cea04000 00000238
Fixed. The cause was related to the thread migration routine to primary
mode (xnshadow_harden), which would spuriously call the Linux
rescheduling procedure from the primary domain under certain
circumstances. This bug only triggers on preemptible kernels. This also
fixes the spinlock recursion issue which is sometimes triggered when the
spinlock debug option is active.
Xenomai-core mailing list