On 06/18/2011 03:58 PM, Jan Kiszka wrote:
> On 2011-06-18 15:12, Gilles Chanteperdrix wrote:
>> On 06/18/2011 03:07 PM, Jan Kiszka wrote:
>>> On 2011-06-18 14:56, Gilles Chanteperdrix wrote:
>>>> On 06/18/2011 02:10 PM, Jan Kiszka wrote:
>>>>> On 2011-06-18 14:09, Gilles Chanteperdrix wrote:
>>>>>> On 06/18/2011 12:21 PM, Jan Kiszka wrote:
>>>>>>> On 2011-06-17 20:55, Gilles Chanteperdrix wrote:
>>>>>>>> On 06/17/2011 07:03 PM, Jan Kiszka wrote:
>>>>>>>>> On 2011-06-17 18:53, Gilles Chanteperdrix wrote:
>>>>>>>>>> On 06/17/2011 04:38 PM, GIT version control wrote:
>>>>>>>>>>> Module: xenomai-jki
>>>>>>>>>>> Branch: for-upstream
>>>>>>>>>>> Commit: 7203b1a66ca0825d5bcda1c3abab9ca048177914
>>>>>>>>>>> URL:    
>>>>>>>>>>> http://git.xenomai.org/?p=xenomai-jki.git;a=commit;h=7203b1a66ca0825d5bcda1c3abab9ca048177914
>>>>>>>>>>> Author: Jan Kiszka <jan.kis...@siemens.com>
>>>>>>>>>>> Date:   Fri Jun 17 09:46:19 2011 +0200
>>>>>>>>>>> nucleus: Fix interrupt handler tails
>>>>>>>>>>> Our current interrupt handlers assume that they leave over the same 
>>>>>>>>>>> task
>>>>>>>>>>> and CPU they entered. But commit f6af9b831c broke this assumption:
>>>>>>>>>>> xnpod_schedule invoked from the handler tail can now actually 
>>>>>>>>>>> trigger a
>>>>>>>>>>> domain migration, and that can also include a CPU migration. This 
>>>>>>>>>>> causes
>>>>>>>>>>> subtle corruptions as invalid xnstat_exectime_t objects may be 
>>>>>>>>>>> restored
>>>>>>>>>>> and - even worse - we may improperly flush XNHTICK of the old CPU,
>>>>>>>>>>> leaving Linux timer-wise dead there (as happened to us).
>>>>>>>>>>> Fix this by moving XNHTICK replay and exectime accounting before the
>>>>>>>>>>> scheduling point. Note that this introduces a tiny imprecision in 
>>>>>>>>>>> the
>>>>>>>>>>> accounting.
>>>>>>>>>> I am not sure I understand why moving the XNHTICK replay is needed: 
>>>>>>>>>> if
>>>>>>>>>> we switch to secondary mode, the HTICK is handled by xnpod_schedule
>>>>>>>>>> anyway, or am I missing something?
>>>>>>>>> The replay can work on an invalid sched (after CPU migration in
>>>>>>>>> secondary mode). We could reload the sched, but just moving the replay
>>>>>>>>> is simpler.
>>>>>>>> But does it not remove the purpose of this delayed replay?
>>>>>>> Hmm, yes, in the corner case of coalesced timed RT task wakeup and host
>>>>>>> tick over a root thread. Well, then we actually have to reload sched and
>>>>>>> keep the ordering to catch that as well.
>>>>>>>> Note that if you want to reload the sched, you also have to shut
>>>>>>>> interrupts off, because upon return from xnpod_schedule after 
>>>>>>>> migration,
>>>>>>>> interrupts are on.
>>>>>>> That would be another severe bug if we left an interrupt handler with
>>>>>>> hard IRQs enabled - the interrupt tail code of ipipe would break.
>>>>>>> Fortunately, only xnpod_suspend_thread re-enables IRQs and returns.
>>>>>>> xnpod_schedule also re-enables but then terminates the context (in
>>>>>>> xnshadow_exit). So we are safe.
>>>>>> I do not think we are, at least on platforms where context switches
>>>>>> happen with irqs on.
>>>>> Can you sketch a problematic path?
>>>> On platforms with IPIPE_WANT_PREEMPTIBLE_SWITCH on, all context switches
>>>> happens with irqs on. So, in particular, the context switch to a relaxed
>>>> task happens with irqs on. In __xnpod_schedule, we then return from
>>>> xnpod_switch_to with irqs on, and so return from __xnpod_schedule with
>>>> irqs on.
>>> "/* We are returning to xnshadow_relax via
>>>     xnpod_suspend_thread, do nothing,
>>>     xnpod_suspend_thread will re-enable interrupts. */"
>>> Looks like this is outdated. I think we best fix this in
>>> __xnpod_schedule by disabling irqs there instead of forcing otherwise
>>> redundant disabling into all handler return paths.
>> I agree.
> I've queued a corresponding patch, and also one to clean up that special
> handshake between xnshadow_relax and xnpod_suspend_thread a bit.
> Consider both as RFC.

Merged your whole branch, but took the liberty to change it a bit
(replacing the commit concerning unlocked context switches with comments
changes only, and changing the commit about xntbase_tick).


Xenomai-core mailing list

Reply via email to