Philippe Gerum wrote:
> On Sun, 2006-12-03 at 22:09 +0100, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Sun, 2006-12-03 at 20:32 +0100, Jan Kiszka wrote:
>>>> Nicolas BLANCHARD wrote:
>>>>>>>>> "Nicolas BLANCHARD" <[EMAIL PROTECTED]> 29.11 11:25 >>>
>>>>>> Hello,
>>>>>>
>>>>>> I've tested with Xenomai 2.3-rc2 (adeos 1.5-02)
>>>>>> and changed the config:
>>>>>> - CONFIG_M586
>>>>>> - disabled CONFIG_INPUT_PCSPKR (it was a module)
>>>>>> - disabled prio boosting (checked CONFIG_XENO_OPT_RPIDISABLE)
>>>>>> and it seems to work better: one hour without blocking, which is a
>>>>>> record for me.
>>>>>>
>>>>>> So I will investigate to find which modification improves my problem.
>>>>> After some tests (kernel compiles), it seems that prio boosting is
>>>>> responsible for my problem. When it is disabled (kernel option
>>>>> checked), my program runs correctly.
>>>> Confirmed!
>>>>
>>>> [EMAIL PROTECTED]:/root# cat /proc/xenomai/sched
>>>> CPU  PID  PRI    PERIOD   TIMEOUT  STAT  NAME
>>>>   0    0   99         0         0  R     ROOT
>>>>   0  837   99   9999312         0  X     TASK1
>>>>   0  838    0  10999998         0  R     TASK2
>>>>
>>>> So far "only" on real hardware (P-I 133) with CONFIG_M586 and (this is
>>>> likely also very important) CONFIG_PREEMPT. I'm now about to check
>>>> whether I can migrate this problem into qemu and/or capture it with the
>>>> I-pipe tracer.
>>>>
>>> Please also try moving task2 to the SCHED_FIFO class to see if things
>>> evolve.
>>>
>> Here is the Xenomai scheduling sequence that leads to the deadlock. I
>> raised the frequency of TASK2 a bit, and this seems to accelerate the
>> lock-up.
>>
>> ...
>>> :| *+[ 844] TASK2   1  -5061+  4.436  xnpod_resume_thread+0x48 (gatekeeper_thread+0xf7)
>>> :| *+[ 827] sshd   -1  -5055+  4.015  xnpod_schedule_runnable+0x45 (gatekeeper_thread+0x12e)
>
> Damnit. What about this?
> --- ksrc/nucleus/shadow.c~	2006-11-19 11:09:00.000000000 +0100
> +++ ksrc/nucleus/shadow.c	2006-12-04 00:29:13.000000000 +0100
> @@ -551,9 +551,6 @@
>  	thread->sched = xnpod_sched_slot(cpu);
>  #endif /* CONFIG_SMP */
>  	xnpod_resume_thread(thread, XNRELAX);
> -#ifndef CONFIG_XENO_OPT_RPIDISABLE
> -	xnpod_renice_root(XNPOD_ROOT_PRIO_BASE);
> -#endif /* CONFIG_XENO_OPT_RPIDISABLE */
>  #ifdef CONFIG_XENO_OPT_ISHIELD
>  	disengage_irq_shield();
>  #endif /* CONFIG_XENO_OPT_ISHIELD */
Stalls on relaxing TASK1:

CPU  PID  PRI    PERIOD   TIMEOUT  STAT  NAME
  0    0   99         0         0  R     ROOT
  0  831   99  10000000         0  R     TASK1
  0  832    1  11000000         0  R     TASK2

My excerpt may have suggested a priority adjustment bug, but there is
none. When TASK2 is resumed, the currently running root thread still has
prio -1, all correct. But here is the full trace of the problematic path:

> :|  # [ 844] TASK2   1  -2992+  9.954  xnpod_resume_thread+0x48 (xnthread_periodic_handler+0x28)
> :|  # func          -2982+  3.368  xnpod_schedule+0xe (xnintr_irq_handler+0xec)
> :|  # [  75] gatekee -1 -2978! 13.759  xnpod_schedule+0x81 (xnintr_irq_handler+0xec)
> :|  # func          -2965+  2.285  __switch_to+0xb (xnpod_schedule+0x6ce)
> :|  # func          -2962+  7.879  debug_smp_processor_id+0x9 (__switch_to+0x1e)
> :|  # [ 844] TASK2   1  -2955+  7.842  xnpod_schedule+0x7a2 (xnpod_suspend_thread+0x1e4)
> :|  # func          -2947! 39.947  __ipipe_restore_pipeline_head+0x8 (xnpod_wait_thread_period+0x1a8)
> :   + func          -2907+  2.406  __ipipe_syscall_root+0x9 (system_call+0x20)
> :   + func          -2904+  2.857  __ipipe_dispatch_event+0xe (__ipipe_syscall_root+0x73)
> :   + func          -2902+  4.375  hisyscall_event+0xe (__ipipe_dispatch_event+0x82)
> :   + func          -2897+  2.225  xnshadow_relax+0xe (hisyscall_event+0x1ed)
> :   + func          -2895+  2.330  schedule_linux_call+0xe (xnshadow_relax+0x7d)
> :|  # func          -2893+  2.962  __ipipe_restore_pipeline_head+0x8 (schedule_linux_call+0xaf)
> :   + func          -2890+  2.323  rthal_apc_schedule+0x8 (schedule_linux_call+0xb9)
> :   + func          -2887+  3.511  __ipipe_schedule_irq+0xa (rthal_apc_schedule+0x31)
> :|  + func          -2884+  2.240  __ipipe_handle_irq+0xe (common_interrupt+0x18)
> :|  + func          -2882+  1.639  __ipipe_ack_common_irq+0xa (__ipipe_handle_irq+0x80)
> :|  + func          -2880+  2.067  ipipe_test_and_stall_pipeline_from+0x8 (__ipipe_ack_common_irq+0x16)
> :|  # func          -2878+  2.879  mask_and_ack_8259A+0xb (__ipipe_ack_common_irq+0x3e)
> :|  + func          -2875+  2.812  __ipipe_dispatch_wired+0xe (__ipipe_handle_irq+0x8a)
> :|  # func          -2872+  1.759  xnintr_clock_handler+0x8 (__ipipe_dispatch_wired+0x7d)
> :|  # func          -2870+  1.969  xnintr_irq_handler+0xe (xnintr_clock_handler+0x17)
> :|  # func          -2868+  1.864  xnpod_announce_tick+0x8 (xnintr_irq_handler+0x39)
> :|  # func          -2867+  3.052  xntimer_do_tick_aperiodic+0xe (xnpod_announce_tick+0xf)
> :|  # func          -2863+  2.511  xnthread_periodic_handler+0x8 (xntimer_do_tick_aperiodic+0x7b)
> :|  # func          -2861+  2.789  xnpod_resume_thread+0xe (xnthread_periodic_handler+0x28)
> :|  # [ 843] TASK1  99  -2858+  7.977  xnpod_resume_thread+0x48 (xnthread_periodic_handler+0x28)
> :|  # func          -2850+  2.270  xnpod_schedule+0xe (xnintr_irq_handler+0xec)
> :|  # [ 844] TASK2   1  -2848+  8.466  xnpod_schedule+0x81 (xnintr_irq_handler+0xec)
> :|  # func          -2839+  1.751  __switch_to+0xb (xnpod_schedule+0x6ce)
> :|  # func          -2838+  6.593  debug_smp_processor_id+0x9 (__switch_to+0x1e)
> :|  # [ 843] TASK1  99  -2831+  4.421  xnpod_schedule+0x7a2 (xnpod_suspend_thread+0x1e4)
> :|  # func          -2827! 17.894  __ipipe_restore_pipeline_head+0x8 (xnpod_wait_thread_period+0x1a8)
> :   + func          -2809+  2.360  __ipipe_syscall_root+0x9 (system_call+0x20)
> :   + func          -2806+  2.210  __ipipe_dispatch_event+0xe (__ipipe_syscall_root+0x73)
> :   + func          -2804+  3.458  hisyscall_event+0xe (__ipipe_dispatch_event+0x82)
> :   + func          -2801+  2.421  xnshadow_relax+0xe (hisyscall_event+0x1ed)
> :   + func          -2798+  2.691  schedule_linux_call+0xe (xnshadow_relax+0x7d)
> :|  # func          -2796+  1.984  __ipipe_restore_pipeline_head+0x8 (schedule_linux_call+0xaf)
> :   + func          -2794+  2.631  rthal_apc_schedule+0x8 (schedule_linux_call+0xb9)
> :|  # func          -2791+  2.180  xnpod_schedule_runnable+0xe (xnshadow_relax+0xd9)
> :|  # [ 843] TASK1  99  -2789+  4.315  xnpod_schedule_runnable+0x45 (xnshadow_relax+0xd9)
> :|  # func          -2785+  5.593  xnpod_suspend_thread+0xe (xnshadow_relax+0x104)
> :|  # func          -2779+  2.045  xnpod_schedule+0xe (xnpod_suspend_thread+0x1e4)
> :|  # [ 843] TASK1  99  -2777+  6.932  xnpod_schedule+0x81 (xnpod_suspend_thread+0x1e4)
> :|  # func          -2770+  2.368  __switch_to+0xb (xnpod_schedule+0x6ce)
> :|  # func          -2768+  5.736  debug_smp_processor_id+0x9 (__switch_to+0x1e)
> :|  # [ 827] sshd   99  -2762+  4.917  xnpod_schedule+0x7a2 (xnintr_irq_handler+0xec)
> :|  + func          -2757+  3.744  __ipipe_walk_pipeline+0xe (__ipipe_handle_irq+0x182)
> :|  + func          -2753+  2.135  __ipipe_unstall_iret_root+0x8 (restore_raw+0x0)
> :   + func          -2751+  2.360  tcp_cong_avoid+0xa (tcp_ack+0x14d3)
> :   + func          -2749+  5.909  bictcp_cong_avoid+0xb (tcp_cong_avoid+0x1b)
> :   + func          -2743+  2.496  __kfree_skb+0xa (tcp_rcv_established+0x109)

This indicates that we face an I-pipe bug: the Linux call scheduled on
relaxation of TASK2, and later of TASK1, somehow gets lost (there is no
rthal_apc_handler in the remaining trace).

Jan
_______________________________________________
Xenomai-help mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-help
