Philippe Gerum wrote:
> On Sun, 2006-12-03 at 20:32 +0100, Jan Kiszka wrote:
>> Nicolas BLANCHARD wrote:
>>>>>>> "Nicolas BLANCHARD" <[EMAIL PROTECTED]> 29.11 11:25 >>>
>>>> Hello,
>>>>
>>>> I've tested with Xenomai 2.3-rc2 (adeos 1.5-02)
>>>> and changed the config:
>>>>   - CONFIG_M586
>>>>   - disabled CONFIG_INPUT_PCSPKR (it was built as a module)
>>>>   - disabled prio boosting (checked CONFIG_XENO_OPT_RPIDISABLE)
>>>> and it seems to work better, one hour without blocking, it's a record
>>>> for me.
>>>>
>>>> So I will investigate to find which modification fixes my problem.
>>> After some tests (kernel compiles), it seems that prio boosting is
>>> responsible for my problem. When it is disabled (kernel option
>>> checked), my program runs correctly.
>> Confirmed!
>>
>> [EMAIL PROTECTED] :/root# cat /proc/xenomai/sched
>> CPU  PID    PRI      PERIOD   TIMEOUT    STAT       NAME
>>   0  0       99      0        0          R          ROOT
>>   0  837     99      9999312  0          X          TASK1
>>   0  838      0      10999998 0          R          TASK2
>>
>> So far "only" on real hardware (P-I 133) with CONFIG_M586 and (this is
>> likely also very important) CONFIG_PREEMPT. I'm now about to check if I
>> can migrate this problem into qemu and/or capture it with the I-pipe tracer.
>>
> 
> Please also try moving task2 to the SCHED_FIFO class to see if things
> evolve.
> 

Here is the Xenomai scheduling sequence that leads to the deadlock. I
raised the frequency of TASK2 a bit, and this seems to accelerate the
lock-up.

...
> :|  *+[  844] TASK2    1 -5061+   4.436  xnpod_resume_thread+0x48 (gatekeeper_thread+0xf7)
> :|  *+[  827] sshd    -1 -5055+   4.015  xnpod_schedule_runnable+0x45 (gatekeeper_thread+0x12e)
> :|  # [  827] sshd    -1 -5015+   6.646  xnpod_schedule+0x81 (xnpod_schedule_handler+0x17)
> :|  # [  844] TASK2    1 -4981+   3.721  xnpod_schedule+0x81 (xnpod_suspend_thread+0x1e4)
> :|  # [   75] gatekee -1 -4971+   6.451  xnpod_schedule+0x7a2 (xnpod_schedule_handler+0x17)

So far everything is fine. Now the thrilling part starts:

> :|  # [  844] TASK2    1 -2992+   9.954  xnpod_resume_thread+0x48 (xnthread_periodic_handler+0x28)
> :|  # [   75] gatekee -1 -2978!  13.759  xnpod_schedule+0x81 (xnintr_irq_handler+0xec)
> :|  # [  844] TASK2    1 -2955+   7.842  xnpod_schedule+0x7a2 (xnpod_suspend_thread+0x1e4)
> :|  # [  843] TASK1   99 -2858+   7.977  xnpod_resume_thread+0x48 (xnthread_periodic_handler+0x28)
> :|  # [  844] TASK2    1 -2848+   8.466  xnpod_schedule+0x81 (xnintr_irq_handler+0xec)
> :|  # [  843] TASK1   99 -2831+   4.421  xnpod_schedule+0x7a2 (xnpod_suspend_thread+0x1e4)
> :|  # [  843] TASK1   99 -2789+   4.315  xnpod_schedule_runnable+0x45 (xnshadow_relax+0xd9)
> :|  # [  843] TASK1   99 -2777+   6.932  xnpod_schedule+0x81 (xnpod_suspend_thread+0x1e4)
> :|  # [  827] sshd    99 -2762+   4.917  xnpod_schedule+0x7a2 (xnintr_irq_handler+0xec)

The trace captured almost 200 further milliseconds, but no more
switching took place (full dump available on request).

So we have

TASK2 resume -> TASK2 relax -> TASK1 resume/TASK2 preempted ->
TASK2 relax -> Lock-up
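
For anyone who wants to go through a full dump mechanically, here is a
rough Python sketch that extracts the per-thread events from trace lines
like the ones above and reports threads whose priority changes during the
trace. It assumes each trace entry sits on a single line (wrapped
continuation lines joined) and that the columns are [pid], name, priority,
time, delay, function (caller). The helper names parse_trace and
priority_changes are made up for this illustration and are not part of
any Xenomai or I-pipe tool.

```python
import re
from typing import NamedTuple


class TraceEvent(NamedTuple):
    pid: int
    name: str
    prio: int
    time_us: int      # assumed: microseconds relative to the freeze point
    function: str
    caller: str


# Assumed layout: "[ pid] name prio time(+|!) delay func+0xNN (caller+0xNN)"
LINE_RE = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<name>\S+)\s+(?P<prio>-?\d+)\s+"
    r"(?P<time>-?\d+)[+!]?\s+(?P<delay>[\d.]+)\s+"
    r"(?P<func>\w+)\+0x[0-9a-f]+\s+\((?P<caller>\w+)\+0x[0-9a-f]+\)"
)

# A few entries copied from the trace in this mail, one entry per line.
SAMPLE = """\
:|  # [  827] sshd    -1 -5015+   6.646  xnpod_schedule+0x81 (xnpod_schedule_handler+0x17)
:|  # [  843] TASK1   99 -2789+   4.315  xnpod_schedule_runnable+0x45 (xnshadow_relax+0xd9)
:|  # [  843] TASK1   99 -2777+   6.932  xnpod_schedule+0x81 (xnpod_suspend_thread+0x1e4)
:|  # [  827] sshd    99 -2762+   4.917  xnpod_schedule+0x7a2 (xnintr_irq_handler+0xec)
"""


def parse_trace(text):
    """Return a TraceEvent for every line matching the assumed layout."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip anything that is not a scheduler trace entry
        events.append(TraceEvent(
            pid=int(m["pid"]), name=m["name"], prio=int(m["prio"]),
            time_us=int(m["time"]), function=m["func"], caller=m["caller"],
        ))
    return events


def priority_changes(events):
    """Map (pid, name) to its priority history, only if it ever changes."""
    seen = {}
    for ev in events:
        seen.setdefault((ev.pid, ev.name), []).append(ev.prio)
    return {key: prios for key, prios in seen.items() if len(set(prios)) > 1}


if __name__ == "__main__":
    for key, prios in priority_changes(parse_trace(SAMPLE)).items():
        print(key, prios)
```

On the sample entries this singles out sshd, which shows up with priority
-1 early on and with priority 99 in the last entry before switching stops.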

Gilles, are we able to produce such a sequence with the switchtest?

OK, it's time now to think a bit about what we see here. Any ideas welcome.

Jan


PS: Here are the stats you asked for, Philippe:

CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          5493       0     01400080   99.1  ROOT
  0  843    646        1294       0     00c00180    0.0  TASK1
  0  844    2152       4337       0     00c00088    0.0  TASK2
  0  0      0          689962     0     00000000    0.9  IRQ0: [timer]
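
As a quick way to read these numbers, the following Python sketch parses
the table and computes the ratio of mode switches (MSW) to context
switches (CSW) per thread. It assumes the column layout shown above, with
NAME as the last column and possibly containing spaces. The helper names
parse_stat and msw_per_csw are hypothetical, chosen for this example only.

```python
# The table from this mail, pasted verbatim.
STAT = """\
CPU  PID    MSW        CSW        PF    STAT       %CPU  NAME
  0  0      0          5493       0     01400080   99.1  ROOT
  0  843    646        1294       0     00c00180    0.0  TASK1
  0  844    2152       4337       0     00c00088    0.0  TASK2
  0  0      0          689962     0     00000000    0.9  IRQ0: [timer]
"""


def parse_stat(text):
    """Parse the table into a list of dicts keyed by column header."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Limit the split so a NAME like "IRQ0: [timer]" stays in one field.
        fields = line.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


def msw_per_csw(rows):
    """Return {NAME: MSW/CSW}, skipping rows without context switches."""
    ratios = {}
    for row in rows:
        csw = int(row["CSW"])
        if csw:
            ratios[row["NAME"]] = int(row["MSW"]) / csw
    return ratios


if __name__ == "__main__":
    for name, ratio in msw_per_csw(parse_stat(STAT)).items():
        print(f"{name:15s} {ratio:.3f}")
```

For both TASK1 and TASK2 the ratio comes out at roughly one mode switch
per two context switches, which would be consistent with both tasks
switching modes regularly, as the trace suggests.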

_______________________________________________
Xenomai-help mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-help