On 2014-02-11 15:30, Philippe Gerum wrote:
> On 02/10/2014 05:40 PM, Jan Kiszka wrote:
>> On 2014-02-08 17:00, Philippe Gerum wrote:
>>> On 02/08/2014 03:44 PM, Gilles Chanteperdrix wrote:
>>>> On 02/08/2014 10:57 AM, Philippe Gerum wrote:
>>>>> there should be no point in instantiating scheduler slots for
>>>>> non-RT CPUs anymore, I agree.
>>>>
>>>> Are you sure this will not break xnintr_core_clock_handler? On some
>>>> architectures, the tick handler is called on all cpus, and simply
>>>> forwards the host tick when a cpu is not supported, but in order to do
>>>> this, it seems to use the xnsched structure.
>>>>
>>>
>>> Yes, the change is incomplete. Either we initialize the ->cpu member in
>>> all slots, including the non-RT ones, or we will need something along
>>> these lines:
>>>
>>> diff --git a/kernel/cobalt/intr.c b/kernel/cobalt/intr.c
>>> index b162d22..4758c6b 100644
>>> --- a/kernel/cobalt/intr.c
>>> +++ b/kernel/cobalt/intr.c
>>> @@ -94,10 +94,10 @@ void xnintr_host_tick(struct xnsched *sched) /* Interrupts off. */
>>>   */
>>> void xnintr_core_clock_handler(void)
>>> {
>>> - struct xnsched *sched = xnsched_current();
>>> - int cpu __maybe_unused = xnsched_cpu(sched);
>>> + int cpu = ipipe_processor_id();
>>> struct xnirqstat *statp;
>>> xnstat_exectime_t *prev;
>>> + struct xnsched *sched;
>>>
>>> if (!xnsched_supported_cpu(cpu)) {
>>> #ifdef XNARCH_HOST_TICK_IRQ
>>> @@ -106,6 +106,7 @@ void xnintr_core_clock_handler(void)
>>> return;
>>> }
>>>
>>> + sched = xnsched_struct(cpu);
>>> statp = __this_cpu_ptr(nktimer.stats);
>>> prev = xnstat_exectime_switch(sched, &statp->account);
>>> xnstat_counter_inc(&statp->hits);
>>>
>>
>> There is more:
>>
>> [ 1.963540] BUG: unable to handle kernel NULL pointer dereference
>> at 0000000000000480
>> [ 1.966360] IP: [<ffffffff81123bdf>] xnshadow_private_get+0x1f/0x40
>> [ 1.967482] PGD 0
>> [ 1.970784] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
>> [ 1.970784] Modules linked in:
>> [ 1.970784] CPU: 3 PID: 53 Comm: init Not tainted 3.10.28+ #102
>> [ 1.970784] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS Bochs 01/01/2011
>> [ 1.970784] task: ffff88003b106300 ti: ffff88003ace8000 task.ti:
>> ffff88003ace8000
>> [ 1.970784] RIP: 0010:[<ffffffff81123bdf>] [<ffffffff81123bdf>]
>> xnshadow_private_get+0x1f/0x40
>> [ 1.970784] RSP: 0018:ffff88003acebb78 EFLAGS: 00010246
>> [ 1.970784] RAX: 0000000000000000 RBX: ffff88003acd78c0 RCX:
>> ffffffff81671250
>> [ 1.970784] RDX: 0000000000000000 RSI: ffffffff818504f8 RDI:
>> 0000000000000000
>> [ 1.970784] RBP: ffff88003acebb78 R08: 0000000000000000 R09:
>> 00000000002a1220
>> [ 1.970784] R10: 0000000000000003 R11: ffff88003e000000 R12:
>> ffff88003acebfd8
>> [ 1.970784] R13: ffff88003e00da00 R14: 0000000000000000 R15:
>> 0000000000000005
>> [ 1.970784] FS: 0000000000000000(0000) GS:ffff88003e000000(0000)
>> knlGS:0000000000000000
>> [ 1.970784] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 1.970784] CR2: 0000000000000480 CR3: 000000003ac54000 CR4:
>> 00000000000006e0
>> [ 1.970784] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [ 1.970784] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>> 0000000000000400
>> [ 1.970784] I-pipe domain Linux
>> [ 1.970784] Stack:
>> [ 1.970784] ffff88003acebc08 ffffffff811275cf ffff88003acebba8
>> ffffffff81374edf
>> [ 1.970784] ffff88003acd78c0 0000000000000079 ffff88003acebc58
>> ffffffff8119ac39
>> [ 1.970784] 0000000000000000 0000000000000082 0000000000000246
>> 0000000000016a90
>> [ 1.970784] Call Trace:
>> [ 1.970784] [<ffffffff811275cf>] ipipe_kevent_hook+0x72f/0x10b0
>> [ 1.970784] [<ffffffff81374edf>] ? __percpu_counter_add+0x5f/0x80
>> [ 1.970784] [<ffffffff8119ac39>] ? exit_mmap+0x139/0x170
>> [ 1.970784] [<ffffffff810d8d9c>] __ipipe_notify_kevent+0x9c/0x130
>> [ 1.970784] [<ffffffff8103f4ff>] mmput+0x6f/0x120
>> [ 1.970784] [<ffffffff811cd20c>] flush_old_exec+0x32c/0x770
>> [ 1.970784] [<ffffffff8121982f>] load_elf_binary+0x31f/0x19f0
>> [ 1.970784] [<ffffffff81218d38>] ? load_script+0x18/0x260
>> [ 1.970784] [<ffffffff81177ce9>] ? put_page+0x9/0x50
>> [ 1.970784] [<ffffffff811cc052>] search_binary_handler+0x142/0x3a0
>> [ 1.970784] [<ffffffff81219510>] ? elf_map+0x120/0x120
>> [ 1.970784] [<ffffffff811cdfec>] do_execve_common+0x4dc/0x590
>> [ 1.970784] [<ffffffff811ce0d7>] do_execve+0x37/0x40
>> [ 1.970784] [<ffffffff811ce35d>] SyS_execve+0x3d/0x60
>> [ 1.970784] [<ffffffff8165b7a9>] stub_execve+0x69/0xa0
>> [ 1.970784] Code: eb d6 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5
>> e8 77 71 53 00 48 c7 c0 80 12 2a 00 65 48 03 04 25 30 cd 00 00 48 8b
>> 50 10 31 c0 <48> f7 82 80 04 00 00 00 00 05 00 75 04 c9 c3 66 90 e8 6b
>> ff ff
>> [ 1.970784] RIP [<ffffffff81123bdf>] xnshadow_private_get+0x1f/0x40
>> [ 1.970784] RSP <ffff88003acebb78>
>> [ 1.970784] CR2: 0000000000000480
>> [ 2.042762] ---[ end trace 5020365fdf0eba4b ]---
>>
>> Taken with current forge next. Seems like we try to obtain
>> xnsched_current_thread() on an unsupported CPU.
>>
>> So, which path to take? Partially initialize sched, or change the
>> places that try to use it on unsupported CPUs?
>>
>
> I'll be digging into this soon. This uncovers a general issue with the
> restricted RT CPU set: a relaxed thread which has moved to a CPU outside
> that set may well trigger pipelined events on this CPU (faults, Linux
> syscalls, etc.).

That must not crash us, but it should remain an exceptional case:
shadowed threads should have a Linux affinity mask that excludes
unsupported CPUs.

However we resolve it, one thing should be kept in mind: unsupported
CPUs should not take Xenomai locks shared with the RT CPUs. That's the
key point of this mask.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
_______________________________________________
Xenomai mailing list
[email protected]
http://www.xenomai.org/mailman/listinfo/xenomai