Hi,

While testing the network between domU <-> dom0 with FTP load tools,
we hit a BUG() in the hypervisor. It is always reproducible within a few minutes.
When it happened, we got the following messages.

vmi15.sky.yk.fujitsu.co.jp login: (XEN) Xen BUG at mm.c:1254
(XEN) FIXME: implement ia64 dump_execution_state()
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Xen BUG at mm.c:1254
(XEN) ****************************************
(XEN) 
(XEN) Manual reset required ('noreboot' specified)
(XEN) machine_halt called.  spinning...

So I applied the attached patch (bug_check.patch) to debug this,
and then I got the following messages.

(XEN) Call Trace:
(XEN)  [<f4000000040c6ce0>] show_stack+0x80/0xa0
(XEN)                                 sp=f0000002fc73f4d0 bsp=f0000002fc739700
(XEN)  [<f40000000407ec60>] domain_put_page+0x350/0x740
(XEN)                                 sp=f0000002fc73f6a0 bsp=f0000002fc739680
(XEN)  [<f400000004083d60>] replace_grant_host_mapping+0x370/0x9b0
(XEN)                                 sp=f0000002fc73f6b0 bsp=f0000002fc7395e8
(XEN)  [<f40000000401ece0>] __gnttab_unmap_common+0x320/0x6d0
(XEN)                                 sp=f0000002fc73f6c0 bsp=f0000002fc739598
(XEN)  [<f4000000040235a0>] do_grant_table_op+0x1080/0x3040
(XEN)                                 sp=f0000002fc73f6d0 bsp=f0000002fc7394a0
(XEN)  [<f400000004002e30>] fast_hypercall+0x170/0x340
(XEN)                                 sp=f0000002fc73fe00 bsp=f0000002fc7394a0
(XEN) d0 mfn=a6a1f, mpaddr=6238c000, pgfn=188e3, mfn of mpaddr=188e3
(XEN) domain_put_page: mfn=a6a1f set INVALID_M2P_ENTRY
(XEN) 
(XEN) Call Trace:
(XEN)  [<f4000000040c6ce0>] show_stack+0x80/0xa0
(XEN)                                 sp=f0000002fc73f4d0 bsp=f0000002fc739700
(XEN)  [<f40000000407ea20>] domain_put_page+0x110/0x740
(XEN)                                 sp=f0000002fc73f6a0 bsp=f0000002fc739680
(XEN)  [<f400000004083d60>] replace_grant_host_mapping+0x370/0x9b0
(XEN)                                 sp=f0000002fc73f6b0 bsp=f0000002fc7395e8
(XEN)  [<f40000000401ece0>] __gnttab_unmap_common+0x320/0x6d0
(XEN)                                 sp=f0000002fc73f6c0 bsp=f0000002fc739598
(XEN)  [<f4000000040235a0>] do_grant_table_op+0x1080/0x3040
(XEN)                                 sp=f0000002fc73f6d0 bsp=f0000002fc7394a0
(XEN)  [<f400000004002e30>] fast_hypercall+0x170/0x340
(XEN)                                 sp=f0000002fc73fe00 bsp=f0000002fc7394a0
(XEN) d0 mfn=a6a1f, mpaddr=7b910000, pgfn=ffffffffffffffff, mfn of mpaddr=1ee44
(XEN) Xen BUG at mm.c:1259
(XEN) FIXME: implement ia64 dump_execution_state()
(XEN) 
(XEN) ****************************************
(XEN) Panic on CPU 13:
(XEN) Xen BUG at mm.c:1259
(XEN) ****************************************
(XEN) 
(XEN) Manual reset required ('noreboot' specified)
(XEN) machine_halt called.  spinning...

This log shows that domain_put_page is called twice for the same mfn (a6a1f),
but with different mpaddrs (6238c000 and 7b910000).
I guess the following sequence occurred:
1. domain_put_page is called the first time.
2. It sets INVALID_M2P_ENTRY for the mfn.
3. domain_put_page is called a second time for the same mfn.
4. It hits BUG() because "get_gpfn_from_mfn(mfn) == INVALID_M2P_ENTRY".
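
To make the sequence above concrete, here is a toy model in C. It is not Xen
code: the table m2p[], the size NR_FRAMES, the result enum and both function
names are hypothetical stand-ins, and the value of INVALID_M2P_ENTRY is an
assumption chosen to match the pgfn=ffffffffffffffff seen in the log. It only
shows why a second put on the same mfn trips the check, and what skipping
already-invalidated entries would look like in principle.

```c
#include <assert.h>

/* Hypothetical stand-ins; the real definitions live in the Xen tree. */
#define NR_FRAMES          8
#define INVALID_M2P_ENTRY  (~0UL)

/* Toy machine-to-physical table, indexed by mfn. */
static unsigned long m2p[NR_FRAMES];

enum put_result { PUT_OK, PUT_WOULD_BUG, PUT_SKIPPED };

/*
 * Models the unguarded path: the first put invalidates the m2p entry,
 * so a second put on the same mfn finds INVALID_M2P_ENTRY.
 */
static enum put_result put_page_unguarded(unsigned long mfn)
{
    assert(mfn < NR_FRAMES);
    if (m2p[mfn] == INVALID_M2P_ENTRY)
        return PUT_WOULD_BUG;      /* where the real code would hit BUG() */
    m2p[mfn] = INVALID_M2P_ENTRY;  /* step 2 in the list above */
    return PUT_OK;
}

/*
 * Models the idea of a workaround (an assumption, not the attached patch):
 * skip the put when the entry is already invalid.
 */
static enum put_result put_page_guarded(unsigned long mfn)
{
    assert(mfn < NR_FRAMES);
    if (m2p[mfn] == INVALID_M2P_ENTRY)
        return PUT_SKIPPED;
    return put_page_unguarded(mfn);
}
```

In this model, after setting m2p[3] = 0x188e3, the first call to
put_page_unguarded(3) returns PUT_OK and the second returns PUT_WOULD_BUG,
while put_page_guarded(3) returns PUT_SKIPPED. Note that skipping only hides
the symptom; it does not explain why the same mfn is put twice.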

The attached patch (avoid_to_domain_put_page_INVALID_M2P_ENTRY.patch) avoids
calling domain_put_page on ptes whose mfn is already marked INVALID_M2P_ENTRY.
With the patch applied, I no longer see this issue.
But I'm not yet sure that the patch is the correct way to fix it.
I'll keep debugging, but if you have any comments, please let me know.

Best Regards,

Akio Takebe

Attachment: avoid_to_domain_put_page_INVALID_M2P_ENTRY.patch
Description: Binary data

Attachment: bug_check.patch
Description: Binary data

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@lists.xensource.com
http://lists.xensource.com/xen-ia64-devel
