Re: [Xenomai-core] [bug] don't try this at home...

Philippe Gerum Fri, 16 Dec 2005 21:56:41 +0100

Philippe Gerum wrote:

Philippe Gerum wrote:
Jan Kiszka wrote:
Jan Kiszka wrote:
Hi Philippe,

I'm afraid this one is serious: let the attached migration stress test
run on likely any Xenomai since 2.0, preferably with
CONFIG_XENO_OPT_DEBUG on. Will give a nice crash sooner or later (I'm
trying to set up a serial console now).
Confirmed here. My test box went through some nifty triple salto out of the window running this frag for 2mn or so. Actually, the semop handshake is not even needed to cause the crash. At first sight, it looks like a migration issue taking place during the critical phase when a shadow thread switches back to Linux to terminate.
As it took some time to persuade my box to not just reboot but to give a
message, I'm posting here the kernel dump of the P-III running
nat_migration:

[...]
Xenomai: starting native API services.
ce649fb4 ce648000 00000b17 00000202 c0139246 cdf2819c cdf28070 0b12d310
       00000037 ce648000 00000000 c02f0700 00009a28 00000000 b7e94a70 bfed63c8
       00000000 ce648000 c0102fcb b7e94a70 bfed63dc b7faf4b0 bfed63c8 00000000
Call Trace:
 [<c0139246>] __ipipe_dispatch_event+0x96/0x130
 [<c0102fcb>] work_resched+0x6/0x1c
Xenomai: fatal: blocked thread migration[22175] rescheduled?!
(status=0x300010, sig=0, prev=watchdog/0[3])
This babe is awakened by Linux while Xeno sees it in a dormant state, likely after it has terminated. No wonder things are going wild after that... Ok, job queued. Thanks.
 CPU  PID    PRI  TIMEOUT  STAT      NAME
   0  0      0    0        00500080  ROOT
   0  22175  1    0        00300110  migration
Timer: none

cea05ee4 d0842c62 cdcb0000 cea6d030 c02f0700 c035cbec c02f0700 00000286
       c0139246 00000022 c02f0700 cdf28070 cdf28070 00000022 00000001 c02f0700
       cea6d030 cdf28070 cea6d158 cea05f78 c02b26c0 cea04000 00000238 d1244537
Call Trace:
 [<c0139246>] __ipipe_dispatch_event+0x96/0x130
 [<c02b26c0>] schedule+0x2d0/0x720
 [<c0137b20>] watchdog+0x0/0x80
 [<c02b3967>] schedule_timeout+0x47/0xb0
 [<c0120070>] process_timeout+0x0/0x10
 [<c0120492>] msleep_interruptible+0x42/0x60
 [<c0137b70>] watchdog+0x50/0x80
 [<c012d0ab>] kthread+0x8b/0x90
 [<c012d020>] kthread+0x0/0x90
 [<c0100ef5>] kernel_thread_helper+0x5/0x10
Fixed. The cause was related to the thread migration routine to primary mode (xnshadow_harden), which would spuriously call the Linux rescheduling procedure from the primary domain under certain circumstances. This bug only triggers on preemptible kernels. This also fixes the spinlock recursion issue which is sometimes triggered when the spinlock debug option is active.

Gasp. I've found a severe regression with this fix, so more work is needed. More later.


--

Philippe.
