Philippe Gerum wrote: > On Mon, 2006-12-04 at 22:06 +0100, Jan Kiszka wrote: >> Jan Kiszka wrote: >>> ... >>> This indicates that we face an I-pipe bug: the scheduled Linux call on >>> relaxation of TASK2 and then later TASK1 somehow gets lost (there is no >>> rthal_apc_handler in the remaining trace). >> I think I got it. No I-pipe bug, but one in the HAL. >> >> What happened? A weird race caused by the unprotected optimisation to >> only call rthal_schedule_irq() if there is no APC pending yet. This is >> the constellation I finally worked out via instrumenting and tracing: >> >> PRIO 1: >> rthal_apc_schedule() >> test&set rthal_apc_pending >> (but no rthal_schedule_irq() yet) >> >> -PREEMPTION- >> >> PRIO 99: >> ... >> rthal_apc_schedule() >> test rthal_apc_pending >> (already set => no >> rthal_schedule_irq()!) >> >> So, no one reported the ACP to I-pipe, and no one ever will in Nicolas >> scenario - soft lock-up! >> >> Nicolas, please give the attached patch a try. Your test is running fine >> for me now. >> >> >> At this chance: do we need rthal_apc_schedule() returning the previous >> state at all? No current caller checks the return value. If it's OK to >> clean this up, I will post a combined patch. > > It's pointless to return this info, the caller would not be able to > exploit it in any way. > > Could you please try this other form of your fix? This would prevent the > virq to be fired twice (or more) uselessly, with no request pending. > Masking interrupts to enforce atomicity is ok here, since > rthal_schedule_irq() will run this way in any case. Additionally, my > comment about migration was suspicious, I do see a case where CPU > migration would leave this code in limbos. But to get the cpuid > properly, we need the local interrupts to be masked first. TIA,
Yes, works fine as well. Then go for the attached fix+cleanup? Jan
Index: ksrc/arch/generic/hal.c
===================================================================
--- ksrc/arch/generic/hal.c (Revision 1920)
+++ ksrc/arch/generic/hal.c (Arbeitskopie)
@@ -565,7 +565,7 @@ int rthal_apc_free(int apc)
/**
* @fn int rthal_apc_schedule (int apc)
- *
+ *
* @brief Schedule an APC invocation.
*
* This service marks the APC as pending for the Linux domain, so that
@@ -580,8 +580,7 @@ int rthal_apc_free(int apc)
*
* @param apc The APC id. to schedule.
*
- * @return 0 or 1 are returned upon success, the former meaning that
- * the APC was already pending. Otherwise:
+ * @return 0 is returned upon success. Otherwise:
*
* - -EINVAL is returned if @a apc is invalid.
*
@@ -596,18 +595,21 @@ int rthal_apc_free(int apc)
int rthal_apc_schedule(int apc)
{
rthal_declare_cpuid;
+ unsigned long flags;
if (apc < 0 || apc >= RTHAL_NR_APCS)
return -EINVAL;
- rthal_load_cpuid(); /* Migration would be harmless here. */
+ rthal_local_irq_save(flags);
+
+ rthal_load_cpuid();
- if (!test_and_set_bit(apc, &rthal_apc_pending[cpuid])) {
+ if (!__test_and_set_bit(apc, &rthal_apc_pending[cpuid]))
rthal_schedule_irq(rthal_apc_virq);
- return 1;
- }
- return 0; /* Already pending. */
+ rthal_local_irq_restore(flags);
+
+ return 0;
}
#ifdef CONFIG_PROC_FS
signature.asc
Description: OpenPGP digital signature
_______________________________________________ Xenomai-help mailing list [email protected] https://mail.gna.org/listinfo/xenomai-help
