Running 'xm restore' immediately after a boot usually wedges the CPU.
However, 'xm save' followed by 'xm restore' works fine (even when the
guest domain and htab are relocated to new memory areas).

^AAA shows the following, with .plpar_hcall_norets  @ c00000000003af78
                         and  .HYPERVISOR_sched_op @ c00000000004415c:
(XEN) *** Dumping CPU3 state: ***
(XEN) ----[ Xen-3.0-unstable     ]----
(XEN) CPU: 00000003   DOMID: 00000001
(XEN) pc c00000000003af88 msr 8000000000009032
(XEN) lr c000000000044210 ctr c000000000044238
(XEN) srr0 ffffffffffffffff srr1 ffffffffffffffff
(XEN) r00: 0000000024555548 c00000000065bcb0 c000000000656630 0000000000000000
(XEN) r04: 0000000000000001 0000000000000000 0000000024555542 c00000000000fc24
(XEN) r08: 00000000ecf515a8 c000000000044238 0000000000989680 c0000000000441a4
(XEN) r12: 0000000001a9f9f8 c00000000052e300 5555555555555555 5555555555555555
(XEN) r16: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r20: 5555555555555555 5555555555555555 5555555555555555 5555555555555555
(XEN) r24: 5555555555555555 5555555555555555 4000000000000000 c000000000000000
(XEN) r28: 0000000000000000 0000000000000010 c00000000053d3c8 0000000000000001
(XEN) reprogram_timer[00] Timeout in the past 0x0000004332DBA479 > 0x00000042C2424DF3
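
As a cross-check on the dump above, pc and lr can be matched against the two
symbol addresses quoted just before it. This is only a minimal Python sketch:
the SYMBOLS table and the resolve() helper are illustrative names, not part of
any Xen or xmon tool. It also computes the gap between the two values in the
reprogram_timer message. Reading them as "current time" and "requested
timeout", in nanoseconds, is an assumption on my part.

```python
# Symbol addresses quoted above the register dump.
SYMBOLS = {
    ".plpar_hcall_norets": 0xC00000000003AF78,
    ".HYPERVISOR_sched_op": 0xC00000000004415C,
}

def resolve(addr):
    """Return (symbol, offset) for the closest symbol at or below addr."""
    below = [(base, name) for name, base in SYMBOLS.items() if base <= addr]
    if not below:
        return None
    base, name = max(below)
    return name, addr - base

pc = 0xC00000000003AF88  # from the "pc" field of the dump
lr = 0xC000000000044210  # from the "lr" field of the dump
for label, addr in (("pc", pc), ("lr", lr)):
    name, off = resolve(addr)
    print(f"{label} = {name}+{off:#x}")

# Values from the reprogram_timer line (ordering and unit are assumed).
reported_now = 0x0000004332DBA479
reported_timeout = 0x00000042C2424DF3
late_by = reported_now - reported_timeout
print(f"gap = {late_by:#x} ({late_by / 1e9:.3f} s if the unit is ns)")
```

The offsets this yields are the same ones that appear in the "Exception: 501"
frames of the call traces further down.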


Here is typical console output with debug prints and exceptions.
If 'xm restore' is run several times, it will often start working,
though the exceptions still occur (the user domain has a ramdisk and
networking). At the bottom is some of the code pointed to by a couple
of the exceptions.


1. 'xm restore' following xm save:

cso84:~ # xm console 6
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6200120000000042
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315899
__sti()
xencons_resume() 
xenbus_resume()
smp_resume()
mfdec: 63024
returning
netfront: device eth0 has copying receive path.

[EMAIL PROTECTED] /]# 
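
To put the mfdec values above on a time scale, they can be divided by the
printed TIMEBASE_FREQ. A minimal Python sketch, assuming the decrementer
counts down at the timebase frequency (dec_to_ms is a name made up for this
illustration):

```python
TIMEBASE_FREQ = 71592390  # ticks per second, as printed in the log

def dec_to_ms(ticks):
    """Convert a decrementer value (timebase ticks) to milliseconds."""
    return ticks * 1000.0 / TIMEBASE_FREQ

# mfdec values from the console log of example 1, in order of appearance.
for ticks in (-12, 14315899, 63024):
    print(f"mfdec {ticks:>9} -> {dec_to_ms(ticks):10.4f} ms")
```

A negative value means the decrementer had already expired when it was read.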


2. reboot with 'xm restore' that worked 1st time:

cso84:~ # xm console 1
mfdec: -14
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315924
__sti()
xencons_resume() 
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
    LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
smp_resume()
mfdec: 90178
returning
netfront: device eth0 has copying receive path.

[EMAIL PROTECTED] /]# 


3. reboot with typical wedge:

cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315903
__sti()
xencons_resume() 
xenbus_resume()
smp_resume()
mfdec: 14218880
returning
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000065B090] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C00000000065B140] [C00000000008956C] .softlockup_tick+0x100/0x128
[C00000000065B200] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C00000000065B280] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C00000000065B3B0] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c
    LR = .__do_IRQ+0x1ac/0x2b4
[C00000000065B6A0] [C0000000005AB7B0] 0xc0000000005ab7b0 (unreliable)
[C00000000065B740] [C000000000089FC8] .__do_IRQ+0x1ac/0x2b4
[C00000000065B800] [C0000000002B7134] .evtchn_do_upcall+0x128/0x1a4
[C00000000065B8C0] [C000000000043664] .xen_get_irq+0x10/0x28
[C00000000065B940] [C00000000000BD7C] .do_IRQ+0x7c/0x100
[C00000000065B9C0] [C0000000000041EC] hardware_interrupt_entry+0xc/0x10
--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_sched_op+0xb4/0x10c
[C00000000065BCB0] [C0000000000BDA74] .kmem_cache_free+0xe4/0x2f4 (unreliable)
[C00000000065BD60] [C0000000000455CC] .xen_power_save+0x80/0x98
[C00000000065BDE0] [C0000000000120E4] .cpu_idle+0x14c/0x154
[C00000000065BE70] [C000000000009174] .rest_init+0x44/0x5c
[C00000000065BEF0] [C0000000004E58D8] .start_kernel+0x2a0/0x308
[C00000000065BF90] [C0000000000084FC] .start_here_common+0x50/0x54
cso84:~ # 


4. reboot with another wedge:

cso84:~ # xm console 1
mfdec: -12
TIMEBASE_FREQ: 71592390
Here we're resuming 
hid4: 0x6000120000000041
arch_gnttab_map: grant table at d000080080000000
irq_resume() 
switch_idle_mm()
mfdec: 14315908
__sti()
xencons_resume() 
xenbus_resume()
BUG: soft lockup detected on CPU#0!
Call Trace:
[C000000001AA3650] [C00000000001062C] .show_stack+0x50/0x1cc (unreliable)
[C000000001AA3700] [C00000000008956C] .softlockup_tick+0x100/0x128
[C000000001AA37C0] [C000000000065BC0] .run_local_timers+0x1c/0x30
[C000000001AA3840] [C000000000023C60] .timer_interrupt+0x108/0x4f0
[C000000001AA3970] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .plpar_hcall_norets+0x10/0x1c
    LR = .HYPERVISOR_event_channel_op+0x34/0x50
[C000000001AA3C60] [C0000000000442E4] .HYPERVISOR_event_channel_op+0x1c/0x50 (unreliable)
[C000000001AA3CF0] [C0000000002BD1F0] .xb_read+0x190/0x2ac
[C000000001AA3E30] [C0000000002BEFD4] .xenbus_thread+0x84/0x278
[C000000001AA3EE0] [C000000000074D08] .kthread+0x158/0x1a8
[C000000001AA3F90] [C000000000028310] .kernel_thread+0x4c/0x68
cso84:~ # 
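
For reading the "Exception:" lines in the traces above: the sketch below
splits the printed number into a vector and its low bits. It assumes the
usual ppc64 Linux convention that the low bits of the saved trap value are
flags rather than part of the vector. The VECTORS table and decode_trap() are
illustrative names only, and the table lists just the two vectors seen here.

```python
# Vector names for the two exceptions seen in the traces above.
VECTORS = {0x500: "external interrupt", 0x900: "decrementer"}

def decode_trap(printed):
    """Split the hex number printed after 'Exception:' into vector and flag bits."""
    value = int(printed, 16)
    vector = value & ~0xF
    flags = value & 0xF
    return vector, flags, VECTORS.get(vector, "unknown")

for printed in ("901", "501"):
    vector, flags, name = decode_trap(printed)
    print(f"Exception {printed}: vector {vector:#x} ({name}), low bits {flags:#x}")
```

This agrees with the frames directly above each exception line, which show
decrementer_common for 901 and hardware_interrupt_entry for 501.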



Some code for example 3:

--- Exception: 901 at .handle_IRQ_event+0x4c/0x13c : c000000000089d2c
0:mon> di c000000000089d20
c000000000089d20  7c0000a6      mfmsr   r0
c000000000089d24  60008000      ori     r0,r0,32768
c000000000089d28  7c010164      mtmsrd  r0,1
c000000000089d2c  7c7d07b4      extsw   r29,r3
c000000000089d30  48000010      b       c000000000089d40        # .handle_IRQ_event+0x60/0x13c
c000000000089d34  ebff0028      ld      r31,40(r31)
c000000000089d38  2fbf0000      cmpdi   cr7,r31,0
c000000000089d3c  419e005c      beq     cr7,c000000000089d98    # .handle_IRQ_event+0xb8/0x13c


--- Exception: 501 at .plpar_hcall_norets+0x10/0x1c : c00000000003af88
0:mon> di c00000000003af78
c00000000003af78  7c421378      mr      r2,r2
c00000000003af7c  7c000026      mfcr    r0
c00000000003af80  90010008      stw     r0,8(r1)
c00000000003af84  44000022      svca    8
c00000000003af88  80010008      lwz     r0,8(r1)
c00000000003af8c  7c0ff120      mtcr    r0
c00000000003af90  4e800020      blr
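
The instruction word at c00000000003af84 (44000022, which xmon prints as
"svca 8") can be decoded by hand. The sketch below follows my reading of the
Power ISA sc encoding (primary opcode 17, a 7-bit LEV field); decode_sc() is
an illustrative helper, not an existing tool. On that reading, LEV=1 is the
hypervisor-call form, which would fit an hcall stub.

```python
def decode_sc(word):
    """Return the LEV field if word is an sc-form instruction, else None."""
    primary_opcode = (word >> 26) & 0x3F
    # sc has primary opcode 17 and a fixed 1 in the second-lowest bit.
    if primary_opcode != 17 or not (word & 0x2):
        return None
    return (word >> 5) & 0x7F

lev = decode_sc(0x44000022)
print("not an sc instruction" if lev is None else f"sc with LEV={lev}")
```

The exception address c00000000003af88 is the instruction right after this
one, i.e. the point the call returns to.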

_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@lists.xensource.com
http://lists.xensource.com/xen-ppc-devel
