On 03/06/2012 05:45 PM, Gilles Chanteperdrix wrote:
On 03/06/2012 04:14 AM, Oscar Dávila wrote:
2012/3/2 Gilles Chanteperdrix<[email protected]>

On 03/03/2012 01:14 AM, Oscar Dávila wrote:
2012/3/2 Gilles Chanteperdrix<[email protected]>

On 03/02/2012 11:04 AM, Gilles Chanteperdrix wrote:
On 03/01/2012 05:23 AM, Oscar Dávila wrote:
I was finally able to get the dump:


post-prompt
No breakpoints or watchpoints.

breakpoints-table-end

post-prompt
Dump of assembler code for function __ipipe_sync_stage:
0xc106d376 <__ipipe_sync_stage+0>:   push   %ebp
(...)
0xc106d526 <__ipipe_sync_stage+432>: ret
End of assembler dump.

When the NMI watchdog triggers, EIP is at 0xc106d5e1, which lies
outside this code.

And this dump does not seem to correspond to the kernel that was running
when the bug happened, because in that case we had

0xc106d5e1 == __ipipe_sync_stage + 0x21b

whereas in your dump,

__ipipe_sync_stage + 0x21b == 0xc106d591

Sorry about that, I lost that kernel image.

Here is a new complete test.

Kernel Messages


Kernel failure message 1:
BUG: NMI Watchdog detected LOCKUP on CPU0, ip c10751d3, registers:

                       local_irq_disable_hw();
c10751bf:     fa                      cli
c10751c0:     89 e0                   mov    %esp,%eax
c10751c2:     25 00 e0 ff ff          and    $0xffffe000,%eax
                       root_stall_after_handler();
                       while (__ipipe_check_root_resched())
c10751c7:     83 78 14 00             cmpl   $0x0,0x14(%eax)
c10751cb:     75 58                   jne    c1075225 <__xirq_end+0x2>
c10751cd:     f6 40 08 08             testb  $0x8,0x8(%eax)
c10751d1:     74 52                   je     c1075225 <__xirq_end+0x2>
c10751d3:     eb f8                   jmp    c10751cd <__ipipe_sync_stage+0x12b>
                               __ipipe_preempt_schedule_irq();

Looks like an infinite loop when CONFIG_PREEMPT is off. Try putting an
#ifdef CONFIG_PREEMPT around this code:

#ifdef CONFIG_PREEMPT
                        while (__ipipe_check_root_resched())
                                __ipipe_preempt_schedule_irq();
#endif

To test that this is indeed the issue, you may try enabling
CONFIG_PREEMPT in the kernel configuration.



I recompiled the kernel with CONFIG_PREEMPT enabled and it worked. I also
tried the other option, adding the #ifdef CONFIG_PREEMPT to the source of
core.c, and that worked too.

So it seems that was the problem. Now I can run trivial_periodic.

However, some time after running trivial_periodic, the machine still
freezes. I will now try to find out where that failure happens.

Which preemption model is preferred? I mean, building with
CONFIG_PREEMPT enabled, or building without the:
                        while (__ipipe_check_root_resched())
                                __ipipe_preempt_schedule_irq();

We should not need either workaround. From reading the code, I do not
understand why the compiler creates this infinite loop. It would be
interesting to generate the pre-processed file to understand how this
happens.


Because CONFIG_PREEMPT is disabled, but __ipipe_check_root_resched() is instantiated. This can't fly.

--
Philippe.

_______________________________________________
Xenomai-help mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-help
