On Tue, Nov 6, 2018 at 10:41 PM Rishi <2rushike...@gmail.com> wrote:

>
>
> On Tue, Nov 6, 2018 at 5:47 PM Wei Liu <wei.l...@citrix.com> wrote:
>
>> On Tue, Nov 06, 2018 at 03:31:31PM +0530, Rishi wrote:
>> >
>> > So after knowing the stack trace, it appears that the CPU was getting
>> > stuck for xen_hypercall_xen_version
>>
>> That hypercall is used when a PV kernel (re-)enables interrupts. See
>> xen_irq_enable. The purpose is to force the kernel to switch to
>> hypervisor.
>>
>> >
>> > watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
>> >
>> >
>> > [30569.582740] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0]
>> > [30569.588186] Kernel panic - not syncing: softlockup: hung tasks
>> > [30569.591307] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G             L   4.19.1 #1
>> > [30569.595110] Hardware name: Xen HVM domU, BIOS 4.4.1-xs132257 12/12/2016
>> > [30569.598356] Call Trace:
>> > [30569.599597]  <IRQ>
>> > [30569.600920]  dump_stack+0x5a/0x73
>> > [30569.602998]  panic+0xe8/0x249
>> > [30569.604806]  watchdog_timer_fn+0x200/0x230
>> > [30569.607029]  ? softlockup_fn+0x40/0x40
>> > [30569.609246]  __hrtimer_run_queues+0x133/0x270
>> > [30569.611712]  hrtimer_interrupt+0xfb/0x260
>> > [30569.613800]  xen_timer_interrupt+0x1b/0x30
>> > [30569.616972]  __handle_irq_event_percpu+0x69/0x1a0
>> > [30569.619831]  handle_irq_event_percpu+0x30/0x70
>> > [30569.622382]  handle_percpu_irq+0x34/0x50
>> > [30569.625048]  generic_handle_irq+0x1e/0x30
>> > [30569.627216]  __evtchn_fifo_handle_events+0x163/0x1a0
>> > [30569.629955]  __xen_evtchn_do_upcall+0x41/0x70
>> > [30569.632612]  xen_evtchn_do_upcall+0x27/0x50
>> > [30569.635136]  xen_do_hypervisor_callback+0x29/0x40
>> > [30569.638181] RIP: e030:xen_hypercall_xen_version+0xa/0x20
>>
>> What is the asm code for this RIP?
>>
>>
>> Wei.
>>
>
> The crash goes away when I append "noirqbalance" to the Xen command line.
> This way all dom0 CPUs remain available, but Xen does not balance IRQs
> across them.
>
> Even though I'm running the irqbalance service in dom0, the IRQs do not
> seem to move. This is the dom0 perspective; I do not know yet whether it
> matches the IRQ placement in Xen.
>
> I tried objdump. The function is present in the output, but there is no
> asm code for it, just "...".
>
> ffffffff81001220 <xen_hypercall_xen_version>:
>
>         ...
>
>
> ffffffff81001240 <xen_hypercall_console_io>:
>
>         ...
>
> All "hypercalls" appear similarly.
>
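
A note on the "..." above: objdump prints "..." for runs of zero bytes unless it is given -z (--disassemble-zeroes), and the hypercall page is only filled in by Xen at runtime, so the stubs are empty in the on-disk image. The symbol addresses are still informative. A minimal sketch, assuming the hypercall page is 4 KiB-aligned and that each stub occupies 32 bytes (the page base is inferred from the addresses, it is not shown in the objdump output):

```python
# Assumptions (not taken from the objdump output itself):
#   - the hypercall page is 4 KiB-aligned
#   - each hypercall stub occupies a fixed 32-byte slot
PAGE_SIZE = 4096
STUB_SIZE = 32


def hypercall_slot(symbol_addr: int) -> int:
    """Return the stub index of a symbol inside its (assumed) hypercall page."""
    return (symbol_addr % PAGE_SIZE) // STUB_SIZE


# Symbol addresses as reported by objdump earlier in this thread.
symbols = {
    "xen_hypercall_xen_version": 0xFFFFFFFF81001220,
    "xen_hypercall_console_io": 0xFFFFFFFF81001240,
}

for name, addr in symbols.items():
    print(f"{name}: slot {hypercall_slot(addr)}")
# prints:
#   xen_hypercall_xen_version: slot 17
#   xen_hypercall_console_io: slot 18
```

The slots come out as 17 and 18, which match __HYPERVISOR_xen_version and __HYPERVISOR_console_io in Xen's public headers. The actual instructions would have to be read from the running guest rather than from vmlinux.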

How frequently is that hypercall (via xen_irq_enable()) issued? Many times
per second, or only once in a while?
During my tests, the system is stable unless I'm downloading a large
file. Files of around 1 GB download without a crash, but the system
crashes when the file is larger than that. I'm using wget to download a
2.1 GB file.
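
To see from dom0 whether interrupt counts actually shift between CPUs while the download runs, one option is to diff two snapshots of /proc/interrupts. A minimal sketch; the helper names and the sample numbers below are made up for illustration, and this only shows the dom0 view, not how Xen routes the underlying IRQs:

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq_label: [count per CPU]}."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()[:ncpu]
        # Skip rows that do not carry one counter per CPU (e.g. ERR, MIS).
        if len(fields) == ncpu and all(f.isdigit() for f in fields):
            counts[label.strip()] = [int(f) for f in fields]
    return counts


def irq_deltas(before, after):
    """Return per-CPU count increases for every IRQ that changed."""
    b, a = parse_interrupts(before), parse_interrupts(after)
    return {
        irq: [new - old for old, new in zip(b[irq], a[irq])]
        for irq in a
        if irq in b and a[irq] != b[irq]
    }


# Illustrative snapshots only; these are NOT measurements from the affected host.
BEFORE = """\
           CPU0       CPU1
 76:       1000          0  xen-dyn    -event     eth0
 77:         20         20  xen-dyn    -event     blkif
ERR:          0
"""
AFTER = """\
           CPU0       CPU1
 76:       1500          0  xen-dyn    -event     eth0
 77:         20         20  xen-dyn    -event     blkif
ERR:          0
"""

print(irq_deltas(BEFORE, AFTER))  # prints {'76': [500, 0]}
```

On a live system the two strings would come from reading /proc/interrupts a few seconds apart. A delta that only ever grows on one CPU suggests the interrupt is not moving from dom0's point of view.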

Is there a way to simulate the PV kernel (re-)enabling interrupts from a
kernel module, in a controlled fashion?
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
