On 18.08.2010 10:27, Philippe Gerum wrote:
> On Tue, 2010-08-17 at 19:43 +0200, Stefan Kisdaroczi wrote:
>
>> On 17.08.2010 12:27, Philippe Gerum wrote:
>>
>>> On Mon, 2010-08-16 at 21:14 +0200, Theo Veenker wrote:
>>>
>>>
>>>> On 08/16/2010 04:26 PM, Theo Veenker wrote:
>>>>
>>>>
>>>>> Gilles Chanteperdrix wrote:
>>>>>
>>>>>
>>>>>> Theo Veenker wrote:
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I want to upgrade all our PC's from Ubuntu hardy to lucid and in the
>>>>>>> process
>>>>>>> I'm also going from kernel 2.6.29.5 with Xenomai 2.4.8 to kernel
>>>>>>> 2.6.32.11
>>>>>>> with Xenomai 2.5.3.
>>>>>>>
>>>>>>> I first built and tested the 2.6.32.11 kernel with 2.5.3 on my hardy
>>>>>>> system
>>>>>>> and all went fine. But the problem is it just doesn't run on the
>>>>>>> lucid distro.
>>>>>>>
>>>>>>>
>>>>>> This, I do not understand, the kernel does not need any support from the
>>>>>> distribution for booting, how can the same kernel boot with one
>>>>>> distribution, and not with the other? When you say the "same kernel", do
>>>>>> you mean the exact same zImage or bzImage, or do you mean the kernel
>>>>>> with the same configuration, but with a different compiler, or only the
>>>>>> version is identical?
>>>>>>
>>>>>>
>>>>>>
>>>>> It is a complete mystery to me too. I compiled my kernel into a deb
>>>>> package
>>>>> and installed the very same deb package on three machines:
>>>>> MSI p45 neo3 with Hardy on it -> works OK
>>>>> MSI p45 neo3 with Lucid on it -> nothing (works fine with regular kernel)
>>>>> MSI 945P with Lucid on it: -> nothing (works fine with regular kernel)
>>>>>
>>>>> I'll try the suggestions posted and keep you informed.
>>>>>
>>>>>
>>>> OK. Connected a terminal to catch early kernel messages. Still no output
>>>> unfortunately (with the regular kernel I do get output on the terminal,
>>>> so the connection works).
>>>>
>>>> Meanwhile also built and tested kernel 2.6.32.15 + xenomai 2.5.4. Still
>>>> nothing.
>>>> I'm clueless. I'm running Xenomai for years on dozens of systems and I've
>>>> never run into problems like this. I think I'll have to sit down and take a
>>>> close look at what I'm doing. I've always built my kernels using make-kpkg,
>>>> maybe that somehow introduces a problem here. I'll try without it.
>>>>
>>>> (unfortunately/luckily I have to work from home for a few days so I can't
>>>> get to the test system until later this week)
>>>>
>>>>
>>> I failed to reproduce the issue yet, but it very much looks like an
>>> I-pipe bug. Could you try the following config variants when time
>>> allows:
>>>
>>>
>> I installed the kernel (2.6.32.15 2.5.4 x86 32bit) which is working on
>> my laptop in a kvm machine.
>> In the virtual machine the kernel never starts and hangs.
>> I attached gdb to kvm and, according to the cpu registers and System.map,
>> it hangs in 'doublefault_fn'. As I'm not really familiar with gdb, I'd be
>> thankful if someone has a hint on how to proceed. Thanks
>>
> If you could ask for a backtrace ("bt" command) in gdb once attached to
> the hung kernel, and post the output here, that would be great.
>

hi philippe, hope this helps:

(gdb) bt
#0  doublefault_fn () at arch/x86/kernel/doublefault_32.c:47
#1  0x00000000 in ?? ()

I set two breakpoints:
1) do_test_wp_bit()
2) zap_low_mappings()
The second breakpoint is never reached, the fault seems to happen in
do_test_wp_bit().

arch/x86/mm/init_32.c : mem_init() -> test_wp_bit() -> do_test_wp_bit()

Breakpoint 1, do_test_wp_bit () at arch/x86/mm/init_32.c:981
981         __asm__ __volatile__(
(gdb) info registers
eax            0xffdff000       -2101248
ecx            0x7fc    2044
edx            0x13e8025        20873253
ebx            0xff7fe000       -8396800
esp            0xc1345fc0       0xc1345fc0
ebp            0x3830   0x3830
esi            0x160    352
edi            0x48d    1165
eip            0xc101a308       0xc101a308 <do_test_wp_bit>
eflags         0x2      [ ]
cs             0x60     96
ss             0x68     104
ds             0x7b     123
es             0x7b     123
fs             0xd8     216
gs             0x0      0

> Meanwhile, I tried to reproduce the issue in kvm with no luck so far.
> Aside from timing issues making the boot over kvm quite shaky and most of
> the time impossible with the APIC enabled, using a legacy 8254 mode
> boots but never hangs. Pure emulation with -no-kvm or enabling kvm on
> the host does not make a difference. I've been trying with a 32bit guest
> over a 64bit host, and both host and guest in 32bit mode to no avail so
> far (QEMU PC emulator version 0.12.3 (qemu-kvm-0.12.3)).
>
> I had a bit more luck on real hw though; a m65 Dell workstation (core2
> duo) seems to be kind enough to break during early boot. The failure
> ratio is variable, but 1 crash over 3-5 boots is common; sometimes it
> even crashes several times in a row. The bad news is that no rs232 is
> available from this machine, and the crash happens way too early to count
> on any usb<->serial converter to get any debug output; so this is going
> to take some time to nail down the bug on this hw. I don't expect
> netconsole to help me in any way either, for the same reason. Here is
> some more information I could get though:
>
> - CONFIG_SMP, CONFIG_*_APIC/IO_APIC do not make any difference. I still
> have a kernel crashing against the wall in plain, basic uniprocessor
> mode (i.e. 8254 legacy IRQ and timing).
>
> - The very same kernel image does not break when booted via tftp here.
> It really seems to need a boot of the kernel image from the hard drive
> to get the issue. However, having the rootfs over NFS or on the hdd does
> not seem to make any difference. This could be the sign of a mishandled
> early access fault, which would be confirmed by your trace showing that
> the double fault handler is called.
>
> - CONFIG_IPIPE introduces the issue alone; no need for CONFIG_XENOMAI.
>
> Since you are lucky enough to reproduce the bug over kvm, could you
> confirm my findings on your setup? i.e. that CONFIG_SMP, CONFIG_*APIC*
> and CONFIG_XENOMAI are not involved in this?
>
> PS: At this point, I think this bug only occurs in 32bit mode, but this
> has to be verified.
>
> TIA,
>
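
Note on reading the register dump above: a minimal Python sketch, not part of the original exchange, that decodes two things gdb prints in "info registers". It reinterprets the 32-bit hex values as signed integers (the second column gdb shows for eax and ebx) and splits the segment selectors into descriptor index, table and requested privilege level, following the standard x86 selector layout (bits 0-1 RPL, bit 2 table indicator, bits 3-15 index). The helper names to_signed32 and decode_selector are made up for this illustration.

```python
def to_signed32(value):
    """Interpret a 32-bit register value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def decode_selector(selector):
    """Split an x86 segment selector into (index, table, rpl)."""
    rpl = selector & 0x3                        # bits 0-1: requested privilege level
    table = "LDT" if selector & 0x4 else "GDT"  # bit 2: table indicator
    index = selector >> 3                       # bits 3-15: descriptor index
    return index, table, rpl


# Values copied from the "info registers" output in the message above.
registers = {"eax": 0xFFDFF000, "ebx": 0xFF7FE000}
selectors = {"cs": 0x60, "ss": 0x68, "ds": 0x7B, "fs": 0xD8}

for name, value in registers.items():
    print(f"{name}: {value:#010x} as signed 32-bit = {to_signed32(value)}")

for name, selector in selectors.items():
    index, table, rpl = decode_selector(selector)
    print(f"{name}: {selector:#06x} -> {table} entry {index}, RPL {rpl}")
```

Decoded this way, cs and ss point at GDT entries 12 and 13 with RPL 0, i.e. the breakpoint was hit in ring 0, which is what one would expect this early in mem_init(). The sketch only interprets the numbers; it says nothing about why the write in do_test_wp_bit() ends up in the double fault handler.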
_______________________________________________
Xenomai-help mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-help
