These calls are almost all Linux kernel calls, so perhaps there is a bug (or bugs) in their kernel PV drivers.
James
On May 29, 2008, at 9:41 AM, Eric Sproul wrote:
Hi,
After recently installing a CentOS 5 domU (PV) on an snv_89 dom0, the guest seems rather unstable. Yesterday I got two core dumps in /var/xen/dump, and I noticed a couple of soft lockups on the guest's console:
BUG: soft lockup detected on CPU#1!
Call Trace:
<IRQ> [<ffffffff802aae32>] softlockup_tick+0xd5/0xe7
[<ffffffff8026cb4a>] timer_interrupt+0x396/0x3f2
[<ffffffff80210afe>] handle_IRQ_event+0x2d/0x60
[<ffffffff802ab1ba>] __do_IRQ+0xa4/0x105
[<ffffffff80288753>] _local_bh_enable+0x61/0xc5
[<ffffffff8026a90e>] do_IRQ+0xe7/0xf5
[<ffffffff80396a89>] evtchn_do_upcall+0x86/0xe0
[<ffffffff8025d8ce>] do_hypervisor_callback+0x1e/0x2c
<EOI> [<ffffffff802619bd>] .text.lock.spinlock+0x2/0x30
[<ffffffff8044936f>] inet6_hash_connect+0xcb/0x2ea
[<ffffffff88115fb6>] :ipv6:tcp_v6_connect+0x530/0x6f6
[<ffffffff802335d9>] lock_sock+0xa7/0xb2
[<ffffffff80258914>] inet_stream_connect+0x94/0x236
[<ffffffff8020ab49>] kmem_cache_alloc+0x62/0x6d
[<ffffffff8020ab49>] kmem_cache_alloc+0x62/0x6d
[<ffffffff803f6405>] sys_connect+0x7e/0xae
[<ffffffff802a84a6>] audit_syscall_entry+0x14d/0x180
[<ffffffff8025d2f1>] tracesys+0xa7/0xb2
BUG: soft lockup detected on CPU#0!
Call Trace:
<IRQ> [<ffffffff802aae32>] softlockup_tick+0xd5/0xe7
[<ffffffff8026cb4a>] timer_interrupt+0x396/0x3f2
[<ffffffff80210afe>] handle_IRQ_event+0x2d/0x60
[<ffffffff802ab1ba>] __do_IRQ+0xa4/0x105
[<ffffffff8026a90e>] do_IRQ+0xe7/0xf5
[<ffffffff80396a89>] evtchn_do_upcall+0x86/0xe0
[<ffffffff8025d8ce>] do_hypervisor_callback+0x1e/0x2c
<EOI> [<ffffffff802619bd>] .text.lock.spinlock+0x2/0x30
[<ffffffff8041e10e>] inet_hash_connect+0xc8/0x41c
[<ffffffff80427780>] tcp_v4_connect+0x372/0x69f
[<ffffffff80230882>] sock_recvmsg+0x101/0x120
[<ffffffff88115c4a>] :ipv6:tcp_v6_connect+0x1c4/0x6f6
[<ffffffff80219c31>] vsnprintf+0x559/0x59e
[<ffffffff802335d9>] lock_sock+0xa7/0xb2
[<ffffffff8025b5fe>] cache_alloc_refill+0x13c/0x4ba
[<ffffffff80258914>] inet_stream_connect+0x94/0x236
[<ffffffff8020ab49>] kmem_cache_alloc+0x62/0x6d
[<ffffffff803f6405>] sys_connect+0x7e/0xae
[<ffffffff802a84a6>] audit_syscall_entry+0x14d/0x180
[<ffffffff8025d2f1>] tracesys+0xa7/0xb2
I've been googling around for answers, but the Red Hat bug most frequently linked seems to relate to live migration, which I've not done. This guest was installed directly using virt-install.
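For reference, the install command was roughly along these lines (I'm reconstructing the flags, disk path, and install URL from memory, so treat the details as approximate):

# virt-install --paravirt --name zimbra --ram 4096 --vcpus 2 \
    --file /path/to/zimbra-root-disk \
    --location http://mirror.centos.org/centos/5/os/x86_64/ \
    --nographics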
Right now I cannot get into my guest: it's not responding on the network or the console. A 'virsh shutdown' looks like it worked, but the console remains unresponsive. It looks like it's just spinning, based on the Time value in 'xm list':
# xm list
Name          ID   Mem  VCPUs  State   Time(s)
Domain-0       0  12051     8  r-----   1497.0
zimbra         5   4096     2  r-----  67931.2
The last time this happened I had to reboot the whole server, which seems drastic. Is there a better way to regain control over the guest?
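Would a forced destroy be the right hammer here, i.e. something like:

# virsh destroy zimbra

or the equivalent:

# xm destroy zimbra

or is there a gentler way to get the domain back under control?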
I also need to figure out how to fix the soft lockups. I'm running the latest available mainline CentOS kernel via yum update. My research so far seems to indicate that this occurs when an IRQ takes too long to respond. Maybe I need to pin the guest to particular CPUs, instead of letting dom0 dynamically assign them? Any advice in this area would be appreciated.
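If pinning is the way to go, I assume it would be something like the following (the VCPU-to-CPU mapping here is just an example, not what I'm actually running):

# xm vcpu-pin zimbra 0 2
# xm vcpu-pin zimbra 1 3

or the equivalent cpus = "2,3" setting in the guest configuration, but I'd welcome guidance on whether that would actually help with the lockups.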
Thanks,
Eric
_______________________________________________
xen-discuss mailing list
[email protected]