Re: [Xenomai-core] [PATCH] Fix host IRQ propagation

Gilles Chanteperdrix Thu, 14 May 2009 05:54:18 -0700

Philippe Gerum wrote:
> On Thu, 2009-05-14 at 12:20 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Wed, 2009-05-13 at 18:10 +0200, Jan Kiszka wrote:
>>>> Philippe Gerum wrote:
>>>>> On Wed, 2009-05-13 at 17:28 +0200, Jan Kiszka wrote:
>>>>>> Philippe Gerum wrote:
>>>>>>> On Wed, 2009-05-13 at 15:18 +0200, Jan Kiszka wrote:
>>>>>>>> Gilles Chanteperdrix wrote:
>>>>>>>>> Jan Kiszka wrote:
>>>>>>>>>> Hi Gilles,
>>>>>>>>>>
>>>>>>>>>> I'm currently facing a nasty effect with switchtest over latest git head
>>>>>>>>>> (only tested this so far): running it inside my test VM (ie. with
>>>>>>>>>> frequent excessive latencies) I get a stalled Linux timer IRQ quite
>>>>>>>>>> quickly. System is otherwise still responsive, Xenomai timers are still
>>>>>>>>>> being delivered, other Linux IRQs too. switchtest complained about
>>>>>>>>>>
>>>>>>>>>>     "Warning: Linux is compiled to use FPU in kernel-space."
>>>>>>>>>>
>>>>>>>>>> when it was started. Kernels are 2.6.28.9/ipipe-x86-2.2-07 and
>>>>>>>>>> 2.6.29.3/ipipe-x86-2.3-01 (LTTng patched in, but unused), both show the
>>>>>>>>>> same effect.
>>>>>>>>>>
>>>>>>>>>> Seen this before?
>>>>>>>>> The warning about Linux being compiled to use FPU in kernel-space means
>>>>>>>>> that you enabled soft RAID or compiled for K7, Geode, or any other
>>>>>>>> RAID is on (ordinary server config).
>>>>>>>>
>>>>>>>>> configuration using 3DNow for such simple operations as memcpy. It is
>>>>>>>>> harmless, it simply means that switchtest can not use fpu in kernel-space.
>>>>>>>>>
>>>>>>>>> The bug you have is probably the same as the one described here, which I
>>>>>>>>> am able to reproduce on my atom:
>>>>>>>>> https://mail.gna.org/public/xenomai-help/2009-04/msg00200.html
>>>>>>>>>
>>>>>>>>> Unfortunately, I for one am working on ARM issues and am not available
>>>>>>>>> to debug x86 issues. I think Philippe is busy too...
>>>>>>>> OK, looks like I got the same flu here.
>>>>>>>>
>>>>>>>> Philippe, did you find out any more details in the meantime? Then I'm
>>>>>>>> afraid I have to pick this up.
>>>>>>> No, I did not resume this task yet. Working from the powerpc side of the
>>>>>>> universe here.
>>>>>> Hoho, don't think this rain here over x86 would have never made it down
>>>>>> to ARM or PPC land! ;)
>>>>>>
>>>>>> Martin, could you check if this helps you, too?
>>>>>>
>>>>>> Jan
>>>>>>
>>>>>> (as usual, ready to be pulled from 'for-upstream')
>>>>>>
>>>>>> --------->
>>>>>>
>>>>>> Host IRQs may not only be triggered from non-root domains.
>>>>> Are you sure of this? I can't find any spot where this assumption would
>>>>> be wrong. host_pend() is basically there to relay RT timer ticks and
>>>>> device IRQs, and this only happens on behalf of the pipeline head. At
>>>>> least, this is how rthal_irq_host_pend() should be used in any case. If
>>>>> you did find a spot where this interface is being called from the lower
>>>>> stage, then this is the root bug to fix.
>>>> I haven't studied the I-pipe trace /wrt this in details yet, but I could
>>>> imagine that some shadow task is interrupted in primary mode by the
>>>> timer IRQ and then leaves the handler in secondary mode due to whatever
>>>> events between schedule-out and in at the end of xnintr_clock_handler.
>>>>
>>> You need a thread context to move to secondary, I just can't see how
>>> such scenario would be possible.
>> Here is the trace of events:
>>
>> => Shadow task starts migration to secondary
>> => in xnpod_suspend_thread, nklock is briefly released before
>>    xnpod_schedule
> 
> Which is the root bug. Blame on me; this recent change in -head breaks a
> basic rule a lot of code is based on: a self-suspending thread may not
> be preempted while scheduling out, i.e. suspension and rescheduling must
> be atomically performed. xnshadow_relax() counts on this too.


Actually, I think the idea was mine in the first place... Maybe we can
pass a special flag to xnpod_suspend_thread to ask for atomic
suspension (maybe reuse XNATOMIC?).
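
To make the idea concrete, here is a rough user-space sketch of what such
a flag would buy us. None of this is nucleus code: SKETCH_ATOMIC,
sketch_suspend() and the other sketch_* names are made up for
illustration, a plain boolean stands in for nklock, and the "IRQ" is just
a function call placed where the lock is (or is not) dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: the caller asks for suspension and rescheduling
 * to happen as one atomic step. Not an existing nucleus flag. */
#define SKETCH_ATOMIC 0x1

struct sketch_thread {
    bool suspended;     /* thread has been marked as blocked */
    bool preempted_mid; /* an IRQ ran between suspend and schedule */
};

static bool sketch_lock_held;                /* stands in for nklock */
static struct sketch_thread *sketch_current; /* thread being scheduled out */

/* Models an interrupt: it can only be delivered while the lock is free.
 * If it hits a thread that is already marked suspended but has not been
 * scheduled out yet, we record the broken invariant. */
static void sketch_irq(void)
{
    if (!sketch_lock_held && sketch_current && sketch_current->suspended)
        sketch_current->preempted_mid = true;
}

/* Models the rescheduling step; it must be entered with the lock held. */
static void sketch_schedule(void)
{
    assert(sketch_lock_held);
    sketch_current = NULL; /* switch away from the suspended thread */
}

void sketch_suspend(struct sketch_thread *t, int flags)
{
    sketch_lock_held = true;
    sketch_current = t;
    t->suspended = true;

    if (!(flags & SKETCH_ATOMIC)) {
        /* Non-atomic path: the lock is briefly released before
         * rescheduling, so a pending IRQ gets in. */
        sketch_lock_held = false;
        sketch_irq();
        sketch_lock_held = true;
    } else {
        /* Atomic path: the lock stays held, the same IRQ stays
         * pending and has no effect here. */
        sketch_irq();
    }

    sketch_schedule();
    sketch_lock_held = false;
}
```

Calling sketch_suspend() with SKETCH_ATOMIC leaves preempted_mid clear,
while calling it with flags == 0 sets it. That is the preemption window
Philippe describes, which a caller like xnshadow_relax() would close by
passing the flag.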

-- 
                                                 Gilles.

_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core
