Hi Jan,

On 14/05/2024 15:51, Jan Beulich wrote:
On 13.05.2024 15:40, Elias El Yandouzi wrote:
From: Hongyan Xia <hongy...@amazon.com>

Create a per-domain mapping of PV guest_root_pt as direct map is being
removed.

Note that we do not map and unmap root_pgt for now since it is still a
xenheap page.

Signed-off-by: Hongyan Xia <hongy...@amazon.com>
Signed-off-by: Julien Grall <jgr...@amazon.com>
Signed-off-by: Elias El Yandouzi <elias...@amazon.com>

----
     Changes in V3:
         * Rename SHADOW_ROOT
         * Haven't addressed the potentially over-allocation issue as I don't get it

I thought I had explained in enough detail that the GDT/LDT area needs
quite a bit more space (2 times 64k per vCPU) than the root PT one (4k
per vCPU). Thus while d->arch.pv.gdt_ldt_l1tab really needs to point at
a full page (as long as not taking into account dynamic domain
properties), d->arch.pv.root_pt_l1tab doesn't need to (and hence might
better be allocated using xzalloc() / xzalloc_array(), even when also
not taking into account dynamic domain properties, i.e. vCPU count).

I now understand your point and yes, you're correct: I was over-allocating. Sorry it took me so long to get it.

I'll go instead with:

@@ -371,6 +396,12 @@ int pv_domain_initialise(struct domain *d)
         goto fail;
     clear_page(d->arch.pv.gdt_ldt_l1tab);

+    d->arch.pv.root_pt_l1tab =
+        xzalloc_array(l1_pgentry_t *,
+                      DIV_ROUND_UP(d->max_vcpus, L1_PAGETABLE_ENTRIES));
+    if ( !d->arch.pv.root_pt_l1tab )
+        goto fail;
+
     if ( levelling_caps & ~LCAP_faulting &&
          (d->arch.pv.cpuidmasks = xmemdup(&cpuidmask_defaults)) == NULL )
         goto fail;

However, I noticed quite a weird bug while doing some testing. I may need your expertise to find the root cause.

In the case where I have more vCPUs than pCPUs (let's say one pCPU for two vCPUs), I noticed that I would always get a page fault in the dom0 kernel (5.10.0-13-amd64) at the exact same location. I did a bit of investigation but couldn't come to a clear conclusion. Looking at the stack trace [1], I have the feeling the crash occurs in a loop or a recursive call.

I tried to identify where the crash occurred using addr2line:

> addr2line -e vmlinux-5.10.0-29-amd64 0xffffffff810218a0
debian/build/build_amd64_none_amd64/arch/x86/xen/mmu_pv.c:880

It turns out to point to the closing brace of the function xen_mm_unpin_all() [2].

I thought the crash could happen in the assembly epilogue while returning from the function, but the output of objdump doesn't even show that address.

The only theory I could think of is that, because we only have one pCPU, we may never execute one of the two vCPUs and therefore never set up the mapping to the guest_root_pt in write_ptbase(), hence the page fault. This is just a guess; I couldn't find any hint suggesting that this is the case. Any idea how I could debug this?

[1] https://pastebin.com/UaGRaV6a
[2] https://github.com/torvalds/linux/blob/v5.10/arch/x86/xen/mmu_pv.c#L880

Elias
