On 04/11/2018 11:04 AM, Razvan Cojocaru wrote:
>> After much debugging, it turns out that the
>> "p2m_is_ram(p2mt)" test in hvm_hap_nested_page_fault() fails if I switch
>> to the new altp2m view fast enough, and that in turn disables the
>> logdirty processing gated on it
>
> Actually as it turns out the exit doesn't happen at all anymore so
> hvm_hap_nested_page_fault() doesn't get called (I've added a printk() in
> hvm_hap_nested_page_fault() just before "/* Check access permissions
> first, then handle faults */" and it doesn't appear).

This is what seems to be happening: after switching to the new altp2m
view early, the following call traces end up calling
p2m_altp2m_propagate_change():

(XEN) Xen call trace:
(XEN)    [<ffff82d08032ee1a>] p2m_altp2m_propagate_change+0x4e/0x508
(XEN)    [<ffff82d0803341b7>] p2m-ept.c#ept_set_entry+0x7e4/0x8c4
(XEN)    [<ffff82d08032801b>] p2m_set_entry+0xe2/0x124
(XEN)    [<ffff82d080328250>] p2m.c#p2m_remove_page+0x1f3/0x209
(XEN)    [<ffff82d0803292df>] guest_physmap_remove_page+0x18c/0x214
(XEN)    [<ffff82d080221bc5>] guest_remove_page+0x27b/0x2d3
(XEN)    [<ffff82d08022232a>] do_memory_op+0x502/0x22f8
(XEN)    [<ffff82d08036d876>] pv_hypercall+0x1f4/0x440
(XEN)    [<ffff82d080374495>] lstar_enter+0x115/0x120

and

(XEN) Xen call trace:
(XEN)    [<ffff82d08032ee1a>] p2m_altp2m_propagate_change+0x4e/0x508
(XEN)    [<ffff82d0803341b7>] p2m-ept.c#ept_set_entry+0x7e4/0x8c4
(XEN)    [<ffff82d08032801b>] p2m_set_entry+0xe2/0x124
(XEN)    [<ffff82d080329b2a>] guest_physmap_add_entry+0x7c3/0xacd
(XEN)    [<ffff82d080222720>] do_memory_op+0x8f8/0x22f8
(XEN)    [<ffff82d08036d876>] pv_hypercall+0x1f4/0x440
(XEN)    [<ffff82d080374495>] lstar_enter+0x115/0x120

So clearly it's the external pages described earlier by George and
Alexey landing.

p2m_altp2m_propagate_change() then proceeds to do nothing (because
gfn > p2m->max_remapped_gfn, _and_ m == INVALID_MFN).

Next, in hvm_hap_nested_page_fault(), p2m_altp2m_lazy_copy() returns 1,
which leads to the function just exiting with no further logdirty checks
(there's a "goto out;" that skips them), and then that's it: I don't see
any more page faults for those gfns.

I've tried an assorted array of strategies: changing
p2m_altp2m_propagate_change() to always call set_entry() on the altp2m
view, doing set_entry() for every gfn in the hostp2m into the new altp2m
view in p2m_init_altp2m_ept(), a combination of the two, and trying to
run the logdirty code in hvm_hap_nested_page_fault() before the
"goto out;" corresponding to p2m_altp2m_lazy_copy(). None of these
things has worked.

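To make the control flow above easier to follow, here is a toy model of
it in plain C. This is not Xen code: every toy_* name, the struct and
the boolean parameters are hypothetical stand-ins I made up for
illustration. It only mirrors the shape described above, where a
successful lazy copy jumps to the exit label before the logdirty
handling is reached.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names or types exist in Xen. */
struct toy_fault_result {
    bool lazy_copied;   /* the altp2m view was populated from the hostp2m */
    bool logdirty_ran;  /* the logdirty handling was reached */
};

/*
 * Hypothetical stand-in for p2m_altp2m_lazy_copy(): reports a copy
 * whenever the faulting entry is not yet present in the altp2m view.
 */
bool toy_lazy_copy(bool entry_in_altp2m)
{
    return !entry_in_altp2m;
}

/*
 * Hypothetical stand-in for the relevant part of
 * hvm_hap_nested_page_fault(): a successful lazy copy takes the
 * "goto out" path, so the logdirty step below it is never executed.
 */
struct toy_fault_result toy_nested_page_fault(bool altp2m_active,
                                              bool entry_in_altp2m,
                                              bool logdirty_write)
{
    struct toy_fault_result r = { false, false };

    if (altp2m_active && toy_lazy_copy(entry_in_altp2m)) {
        r.lazy_copied = true;
        goto out;               /* skips the logdirty handling */
    }

    if (logdirty_write)
        r.logdirty_ran = true;  /* stand-in for the logdirty processing */

out:
    return r;
}

/* Walks through the cases discussed in this mail. */
void toy_demo(void)
{
    /* Early switch to the altp2m view: entry missing, lazy copy wins. */
    struct toy_fault_result early = toy_nested_page_fault(true, false, true);
    assert(early.lazy_copied && !early.logdirty_ran);

    /* No altp2m active: the fault reaches the logdirty handling. */
    struct toy_fault_result host = toy_nested_page_fault(false, false, true);
    assert(!host.lazy_copied && host.logdirty_ran);
}
```

In this model the first fault on a gfn in a freshly switched view always
takes the lazy-copy exit. It says nothing about why no further faults
arrive for those gfns afterwards, which is the part I can't explain yet.
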
Thanks,
Razvan