On Fri, 2019-01-11 at 10:40 +0100, Henning Schild wrote:
> On Fri, 11 Jan 2019 09:57:50 +0100, Mauro Salvini via Xenomai
> <xenomai@xenomai.org> wrote:
> 
> > Hi all,
> > 
> > I'm testing the same hardware as in [1], with kernel 4.9.146 from
> > ipipe-4.9.y with [2] applied, compiled with ARCH=i386, and Xenomai
> > 3.0.7.
> 
> To be honest, i386 is not really tested anymore; in fact, in 4.14 it
> is not even supported at the moment. If you can, you should go for
> x86_64.
> 

Hi Henning,

Thank you. I'm trying the i386 version because of legacy 32-bit code
that uses rtnet (which cannot be used with a mixed ABI).

> > Launching
> > 
> > xeno-test -l "dohell -s xxx -p yyy -m xxx 90000" -T 90000
> > 
> > I got this dump in dmesg, sometimes just after latency starts,
> > sometimes after a few seconds (a side effect is an increase in the
> > max latency value):
> > 
> > [  167.914184] ------------[ cut here ]------------
> > [  167.914208] WARNING: CPU: 0 PID: 606 at /home/build-ws/develop/linux-4.9.146/arch/x86/include/asm/fpu/internal.h:511 fpu__restore+0x1eb/0x2b0
> > [  167.914216] Modules linked in: intel_rapl intel_powerclamp
> >   iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm irqbypass
> >   crc32_pclmul aesni_intel xts aes_i586 lrw gf128mul ablk_helper
> >   cryptd snd_pcm intel_cstate snd_timer evdev snd soundcore i915
> >   pcspkr drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect
> >   sysimgblt shpchp video lpc_ich mfd_core button ip_tables x_tables
> >   autofs4 ext4 crc16 jbd2 fscrypto mbcache hid_generic usbhid hid
> >   mmc_block crc32c_intel i2c_i801 i2c_smbus igb i2c_algo_bit xhci_pci
> >   ptp pps_core xhci_hcd sdhci_pci sdhci usbcore mmc_core fjes
> >   [last unloaded: rtnet]
> > [  167.914768] CPU: 0 PID: 606 Comm: dohell Not tainted 4.9.146+ #1
> > [  167.914772] Hardware name: Default string Default string/Q7-BW, BIOS V1.20#KW050220A 03/16/2018
> > [  167.914775] I-pipe domain: Linux
> > [  167.914778]  f42e5e44 daeffa2d 00000000 db335030 dac1ff3b f42e5e74 dac59dea db34504c
> > [  167.914800]  00000000 0000025e db335030 000001ff dac1ff3b 000001ff f4291bc0 00000246
> > [  167.914822]  f4291c00 f42e5e88 dac59efb 00000009 00000000 00000000 f42e5ea4 dac1ff3b
> > [  167.914843] Call Trace:
> > [  167.914846]  [<daeffa2d>] dump_stack+0x9f/0xc2
> > [  167.914849]  [<dac1ff3b>] ? fpu__restore+0x1eb/0x2b0
> > [  167.914865]  [<dac59dea>] __warn+0xea/0x110
> > [  167.914868]  [<dac1ff3b>] ? fpu__restore+0x1eb/0x2b0
> > [  167.914871]  [<dac59efb>] warn_slowpath_null+0x2b/0x30
> > [  167.914874]  [<dac1ff3b>] fpu__restore+0x1eb/0x2b0
> > [  167.914877]  [<dac21b0a>] __fpu__restore_sig+0x2ba/0x680
> > [  167.914879]  [<dac22141>] fpu__restore_sig+0x31/0x50
> > [  167.914882]  [<dac13f52>] restore_sigcontext.isra.9+0xf2/0x110
> > [  167.914885]  [<dac149b9>] sys_sigreturn+0xa9/0xc0
> > [  167.914888]  [<dac019f5>] do_int80_syscall_32+0x85/0x190
> > [  167.914891]  [<db1a56d5>] entry_INT80_32+0x31/0x31
> > [  167.914898] ---[ end trace e57344f10f300a76 ]---
> 
> I am not sure which path leads you there, but it could well be a
> state that was caused by the ipipe patch.
> 
> Could you try this:
> 
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -426,6 +426,10 @@ void fpu__restore(struct fpu *fpu)
>         /* Avoid __kernel_fpu_begin() right after fpregs_activate() */
>         kernel_fpu_disable();
>         trace_x86_fpu_before_restore(fpu);
> +       if (fpregs_activate(fpu)) {

This does not compile because fpregs_activate() returns void; perhaps
you meant "if (fpregs_active(fpu))"?
Since fpregs_active() takes no arguments, I tried this instead:

if (fpu->fpregs_active)

With that change the warning is no longer raised (and neither is the
warning added by this patch).

> +               WARN_ON_FPU(fpu != this_cpu_read_stable(fpu_fpregs_owner_ctx));
> +               fpregs_deactivate(fpu);
> +       }
>         fpregs_activate(fpu);
>         copy_kernel_to_fpregs(&fpu->state);
>         trace_x86_fpu_after_restore(fpu);
> 
> This would not be a proper fix, especially if you end up seeing that
> warning ...
> 
> Henning
> 
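The guarded restore discussed above can be modelled in user space with a
minimal sketch. Everything here is a mock and an assumption, not kernel
code: struct fpu is reduced to the single fpregs_active flag,
fpregs_owner stands in for the per-CPU fpu_fpregs_owner_ctx, the two
counters stand in for WARN_ON_FPU(), and fpu_restore_guarded() is a
hypothetical name. It also assumes that the reported warning comes from
the "already active" check made when the registers are activated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock of the kernel's struct fpu: only the flag used in this thread. */
struct fpu {
    bool fpregs_active;
};

/* Mock of the per-CPU fpu_fpregs_owner_ctx pointer. */
static struct fpu *fpregs_owner;

/* Counters standing in for WARN_ON_FPU() hits. */
static int activate_warnings; /* activation while already active (assumed) */
static int owner_warnings;    /* active, but not owned by this CPU */

static void fpregs_deactivate(struct fpu *fpu)
{
    fpu->fpregs_active = false;
    fpregs_owner = NULL;
}

static void fpregs_activate(struct fpu *fpu)
{
    /* Assumed to mirror the check that produced the reported warning. */
    if (fpu->fpregs_active)
        activate_warnings++;
    fpu->fpregs_active = true;
    fpregs_owner = fpu;
}

/* Restore path with the guard: clear a stale active state first. */
static void fpu_restore_guarded(struct fpu *fpu)
{
    assert(fpu != NULL);
    if (fpu->fpregs_active) {
        /* Models the WARN_ON_FPU() added by the proposed patch. */
        if (fpu != fpregs_owner)
            owner_warnings++;
        fpregs_deactivate(fpu);
    }
    fpregs_activate(fpu);
}
```

With an fpu that is already active and owned, fpu_restore_guarded()
leaves both counters at zero, whereas calling fpregs_activate() directly
on the same state increments activate_warnings. This only illustrates
why the guard silences the warning; it says nothing about how the stale
state arises in the first place.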
> > I found the discussion at [3] and applied the patch at [4] that came
> > out of it, but the result is the same.
> > 
> > Starting xeno-test without the -l argument gives the same result.
> > When dohell is launched alone (with the same arguments as when
> > launched from xeno-test -l), the dump does not appear.
> > 
> > Could this be a Xenomai-related problem (though the stack does not
> > seem to involve Xenomai), or would it be better to post it on LKML?
> > 
> > Thanks in advance, regards
> > 
> > Mauro
> > 
> > [1] https://xenomai.org/pipermail/xenomai/2018-December/040142.html
> > [2] https://xenomai.org/pipermail/xenomai/2019-January/040172.html
> > [3] https://lore.kernel.org/lkml/20181120102635.ddv3fvavxajjlfqk@linutronix.de/
> > [4] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.9.y&id=d3741e0390287056011950493a641524f49fa05a
> > 
> > 
> 
> 
