On 2011-04-14 16:09, Philippe Gerum wrote:
> On Thu, 2011-04-14 at 15:46 +0200, Jesper Christensen wrote:
>
>> Actually I have been running with CONFIG_XENO_HW_UNLOCKED_SWITCH the
>> whole time
>>
> You mean enabled?
>
Disabled, sorry.
>
>> and I also raised the stack size from 4k to 8k. I do however
>> think there could be some fishiness in entry_32.S. In
>> "transfer_to_handler", SPRN_SPRG3 is used to check for stack overflow (at
>> least in my kernel 2.6.29.6), but I must admit I haven't seen any of
>> that in the kernel log.
>>
>>
> Mmm, you are right. In any case, what we want with the unmasked switch
> feature is to allow interrupts while we flush the TLB and set the new mm
> context, which may be lengthy on some low-end platforms. Allowing the
> switch code to be preempted during the register swap is of no use wrt
> latency.
>
> Do you have a patch at hand which you could post that flips MSR_EE in
> rthal_thread_switch already?
>
>
The patch below masks hardware interrupts around the whole
rthal_thread_switch call; it should instead flip the bit inside the
function, as you suggest.
diff --git a/include/asm-powerpc/bits/pod.h b/include/asm-powerpc/bits/pod.h
old mode 100644
new mode 100755
index 6269907..e279647
--- a/include/asm-powerpc/bits/pod.h
+++ b/include/asm-powerpc/bits/pod.h
@@ -106,6 +106,7 @@ static inline void xnarch_switch_to(xnarchtcb_t *out_tcb,
 	struct mm_struct *prev_mm = out_tcb->active_mm, *next_mm;
 	struct task_struct *prev = out_tcb->active_task;
 	struct task_struct *next = in_tcb->user_task;
+	unsigned long flags;
 	if (likely(next != NULL)) {
 		in_tcb->active_task = next;
@@ -156,12 +157,14 @@ static inline void xnarch_switch_to(xnarchtcb_t *out_tcb,
 #endif /* PPC32 */
 #endif /* !__IPIPE_FEATURE_HARDENED_SWITCHMM */
+	rthal_local_irq_save_hw(flags);
 #ifdef CONFIG_PPC64
 	rthal_thread_switch(out_tcb->tsp, in_tcb->tsp, next == NULL);
 #else
 	rthal_thread_switch(out_tcb->tsp, in_tcb->tsp);
 #endif
 	barrier();
+	rthal_local_irq_restore_hw(flags);
 }
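
To make the intended ordering concrete, here is a minimal user-space
sketch of the narrower window: the lengthy mm/TLB part stays preemptible
and only the register swap runs with EE cleared. Everything in it is a
hypothetical stand-in (sim_msr, SIM_MSR_EE, sim_irq_save_hw,
sim_irq_restore_hw, sim_switch_to); it only simulates the MSR in a
variable and is not the Xenomai or I-pipe implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PowerPC MSR. Only the EE (external
 * interrupt enable) bit is modelled; the value is illustrative. */
#define SIM_MSR_EE 0x8000UL

static unsigned long sim_msr = SIM_MSR_EE;	/* interrupts start enabled */

static bool sim_irqs_enabled(void)
{
	return (sim_msr & SIM_MSR_EE) != 0;
}

/* Return the previous state, then clear EE (save-and-disable). */
static unsigned long sim_irq_save_hw(void)
{
	unsigned long flags = sim_msr;

	sim_msr &= ~SIM_MSR_EE;
	return flags;
}

/* Put EE back to whatever it was when flags was saved, so a nested
 * call cannot re-enable interrupts for an outer masked section. */
static void sim_irq_restore_hw(unsigned long flags)
{
	sim_msr = (sim_msr & ~SIM_MSR_EE) | (flags & SIM_MSR_EE);
}

/* Records the interrupt state seen in each phase of the switch. */
struct sim_trace {
	bool mm_switch_irqs_on;
	bool reg_swap_irqs_on;
};

static struct sim_trace sim_switch_to(void)
{
	struct sim_trace t;
	unsigned long flags;

	/* Phase 1: TLB flush and new mm context would go here.
	 * Interrupt state is left untouched, so this stays preemptible. */
	t.mm_switch_irqs_on = sim_irqs_enabled();

	/* Phase 2: the register swap (stack pointer, SPRG3) would go
	 * here, with EE cleared for the duration. */
	flags = sim_irq_save_hw();
	assert(!sim_irqs_enabled());
	t.reg_swap_irqs_on = sim_irqs_enabled();
	sim_irq_restore_hw(flags);

	return t;
}
```

Calling sim_switch_to() with interrupts enabled reports them on during
the mm phase and off during the swap, and leaves them enabled afterwards.
Because restore writes back the saved bit rather than setting EE
unconditionally, a caller that already masked interrupts stays masked.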
>> /Jesper
>>
>>
>> On 2011-04-14 15:31, Philippe Gerum wrote:
>>
>>> On Thu, 2011-04-14 at 15:04 +0200, Jesper Christensen wrote:
>>>
>>>
>>>> I wrote about some problems concerning stack corruption when running
>>>> Xenomai on PPC. I have found out that if I disable hardware interrupts
>>>> while running "rthal_thread_switch" the problem seems to disappear
>>>> somewhat. I saw a crash yesterday after running for 3 hours, and I'm
>>>> currently running a test (has been running for 3 hours). Usually it
>>>> would fail after 30-40 minutes. My question is: could there be a problem
>>>> if we receive an interrupt between updating the stack pointer and the
>>>> SPRG3 register with the new thread pointer?
>>>>
>>>>
>>>>
>>> Normally, there should not be any issue (famous last words), since we
>>> would run Xenomai-only code over the preempted context, and we don't
>>> depend on SPRG3 to fetch the current phys address. In fact, at this
>>> stage we simply don't care about the linux context, only referring to
>>> the current Xenomai thread, which is obtained differently.
>>>
>>> Try switching off CONFIG_XENO_HW_UNLOCKED_SWITCH in the "machine"
>>> config area. If this ends up being rock-solid, then this would be a hint
>>> that something may be fishy in this area. Raising your k-thread stack
>>> sizes in a separate test may be interesting to check too, if not already
>>> done.
>>>
>>>
>>>
>>>
>>>> /Jesper
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Xenomai-core mailing list
>>>> [email protected]
>>>> https://mail.gna.org/listinfo/xenomai-core
>>>>
>>>>
>>>
>>>
>>
>