Hello,

Booting with maxcpus=1 still triggers the spurious interrupt, and this time the system locks up completely.

I have attached the contents of the debug/irq directory, captured after the problem was triggered.
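
For anyone who wants to collect the same data, here is a minimal sketch of how such a snapshot could be taken. It is an illustration only, not the procedure used for the attached archive. The function name snapshot_dir is hypothetical; the source path is the one Jan asked for (/sys/kernel/debug/irq), and reading it needs root plus a mounted debugfs. Files on pseudo-filesystems usually report a size of 0, so the sketch reads each file first and records its real length.

```python
import io
import tarfile
from pathlib import Path


def snapshot_dir(src: str, dest: str) -> int:
    """Archive every readable regular file under src into dest (tar.xz).

    Hypothetical helper: each file is read into memory and added with
    its actual length, because debugfs entries usually stat as 0 bytes.
    Returns the number of files archived.
    """
    root = Path(src)
    count = 0
    with tarfile.open(dest, "w:xz") as tar:
        for path in sorted(root.rglob("*")):
            if not path.is_file():
                continue
            try:
                data = path.read_bytes()
            except OSError:
                continue  # entry could not be read; skip it
            # Store names relative to the parent, e.g. "irq/irqs/1".
            info = tarfile.TarInfo(name=str(path.relative_to(root.parent)))
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
            count += 1
    return count


# Example call (as root): snapshot_dir("/sys/kernel/debug/irq", "debugirq.tar.xz")
```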

Some things that might be relevant:
-   the SoC would use PINCTRL_BROXTON under Linux, but this is disabled (it has not been fixed up for Xenomai)
-   the regular igb driver is in use; I unbind the network card from it before binding the rt_igb driver
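
The unbind/bind step above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the exact procedure used on the target: it assumes the drivers appear in sysfs as "igb" and "rt_igb", and takes the PCI address 0000:03:00.0 from the log quoted below. The function name rebind and its sysfs parameter are hypothetical; the parameter exists only so the sketch can be tried against a scratch directory instead of the real /sys.

```python
from pathlib import Path


def rebind(pci_addr: str, old_drv: str, new_drv: str,
           sysfs: str = "/sys") -> None:
    """Detach pci_addr from old_drv and hand it to new_drv via sysfs.

    Hypothetical helper; requires root when run against the real /sys.
    """
    drivers = Path(sysfs) / "bus" / "pci" / "drivers"
    # Writing the PCI address to <driver>/unbind releases the device.
    (drivers / old_drv / "unbind").write_text(pci_addr)
    # Writing it to <driver>/bind asks the new driver to claim the device.
    (drivers / new_drv / "bind").write_text(pci_addr)


# Example call (as root, with rt_igb already loaded):
# rebind("0000:03:00.0", "igb", "rt_igb")
```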

Regards, Norbert

> -----Original Message-----
> From: Jan Kiszka <jan.kis...@siemens.com>
> Sent: Freitag, 5. Juli 2019 15:57
> To: Lange Norbert <norbert.la...@andritz.com>; Xenomai
> (xenomai@xenomai.org) <xenomai@xenomai.org>; Philippe Gerum
> <r...@xenomai.org>
> Subject: Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp to up
>
> On 05.07.19 13:29, Jan Kiszka wrote:
> > On 05.07.19 12:43, Lange Norbert wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: Jan Kiszka <jan.kis...@siemens.com>
> >>> Sent: Freitag, 5. Juli 2019 09:39
> >>> To: Lange Norbert <norbert.la...@andritz.com>; Xenomai
> >>> (xenomai@xenomai.org) <xenomai@xenomai.org>; Philippe Gerum
> >>> <r...@xenomai.org>
> >>> Subject: Re: ipipe 4.19: spurious APIC interrupt when setting rt_igp
> >>> to up
> >>>
> >>> On 04.07.19 12:21, Jan Kiszka wrote:
> >>>> On 04.07.19 12:15, Jan Kiszka wrote:
> >>>>> On 04.07.19 10:57, Lange Norbert via Xenomai wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> using the rt_igb driver with the recent ipipe/kernel will result
> >>>>>> in a broken state (I assume one cpu core is “stuck”).
> >>>>>>
> >>>>>> This is a quote from Philippe (note that I tested the plain
> >>>>>> upstream revision below):
> >>>>>>> This happens specifically when the igb driver enables the device
> >>>>>>> at rtifconfig up only with 4.19+.
> >>>>>>> The HIPASE clock device is fine and can be enabled manually with
> >>>>>>> no issue.
> >>>>>>> The spurious IRQ message is only a symptom; something seems
> >>>>>>> wrong with this fairly old (rt_)igb code on recent kernels.
> >>>>>>
> >>>>>> + modprobe rtnet
> >>>>>> + modprobe rtpacket
> >>>>>> + modprobe rt_igb
> >>>>>> [  325.791715] RTnet: registered rteth0
> >>>>>> [  325.795328] rt_igb 0000:03:00.0: Intel(R) Gigabit Ethernet Network Connection
> >>>>>> [  325.802505] rt_igb 0000:03:00.0: rteth0: (PCIe:2.5Gb/s:Width x1) 22:20:47:8d:0f:c9
> >>>>>> [  325.810103] rt_igb 0000:03:00.0: rteth0: PBA No: FFFFFF-0FF
> >>>>>> [  325.815696] rt_igb 0000:03:00.0: Using MSI-X interrupts. 1 rx queue(s), 1 tx queue(s)
> >>>>>> [  325.823638] sdhci-pci 0000:00:1b.0: SDHCI controller found [8086:5aca] (rev b)
> >>>>>>
> >>>>>> + rtifconfig rteth0 up
> >>>>>> [  326.066500] spurious APIC interrupt through vector ff on CPU#0, should never happen.
> >>>>>>
> >>>>>
> >>>>> Can you retry with https://lkml.org/lkml/2019/7/3/143 applied? It
> >>>>> should tell us the real vector number.
> >>>>>
> >>>>> I'll see in parallel if I can reproduce with rt_igb here.
> >>
> >> Applying that patch then causes the ipipe-patch to fail.
> >> Would take me some time to cleanup.
> >>
> >
> > Yes, did this yesterday, and it requires more work. But the
> > information from it is no longer essential.
> >
> >>>>
> >>>> Already succeeded, with rt_e1000e in KVM. Debugging...
> >>>>
> >>>
> >>> This addresses it on x86 for me:
> >>>
> >>> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> >>> index 6c279e065879..d503b875f086 100644
> >>> --- a/kernel/irq/chip.c
> >>> +++ b/kernel/irq/chip.c
> >>> @@ -1099,7 +1099,8 @@ void ipipe_enable_irq(unsigned int irq)
> >>>                  ipipe_root_only();
> >>>
> >>>                  raw_spin_lock_irqsave(&desc->lock, flags);
> >>> -               if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP) {
> >>> +               if (desc->istate & IPIPE_IRQS_NEEDS_STARTUP &&
> >>> +                   !WARN_ON(irq_activate(desc))) {
> >>>                          desc->istate &= ~IPIPE_IRQS_NEEDS_STARTUP;
> >>>                          chip->irq_startup(&desc->irq_data);
> >>>                  }
> >>
> >> The problem still persists for me with that patch. I use an nfsroot
> >> (with a USB->ETH adapter so I can kick out the Linux igb driver);
> >> maybe that's related.
> >
> > Does reducing your machine to maxcpus=1 resolve the issue? I could
> > imagine we have an affinity problem on top.
> >
>
> We do have an affinity problem, will try to fix it soon, but that didn't
> allow me to reproduce your issue with my patch applied.
>
> Could you turn on CONFIG_GENERIC_IRQ_DEBUGFS and grab the content of
> /sys/kernel/debug/irq? Maybe Linux considers the interrupt in question
> here as "affinity managed by kernel", and then my patch is a nop. Still need to
> understand all implications of this managed mode for I-pipe.
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA IOT SES-DE Corporate
> Competence Center Embedded Linux
-------------- next part --------------
A non-text attachment was scrubbed...
Name: debugirq.tar.xz
Type: application/octet-stream
Size: 1808 bytes
Desc: debugirq.tar.xz
URL: <http://xenomai.org/pipermail/xenomai/attachments/20190709/8de577f3/attachment.obj>
