On 06.09.21 10:46, Andrew Cooper wrote:
On 06/09/2021 09:23, Juergen Gross wrote:
On 03.09.21 17:41, Bertrand Marquis wrote:
Hi,

While doing some investigation with cpupools I encountered a crash
when trying to isolate a guest to its own physical cpu.

I am using current staging status.

I did the following (on FVP with 8 cores):
- start dom0 with dom0_max_vcpus=1
- remove core 1 from dom0 cpupool: xl cpupool-cpu-remove Pool-0 1
- create a new pool: xl cpupool-create name=\"NetPool\”
- add core 1 to the pool: xl cpupool-cpu-add NetPool 1
- create a guest in NetPool using the following in the guest config:
pool=“NetPool"

I end with a crash with the following call trace during guest creation:
(XEN) Xen call trace:
(XEN)    [<0000000000234cb0>] credit2.c#csched2_alloc_udata+0x58/0xfc
(PC)
(XEN)    [<0000000000234c80>] credit2.c#csched2_alloc_udata+0x28/0xfc
(LR)
(XEN)    [<0000000000242d38>] sched_move_domain+0x144/0x6c0
(XEN)    [<000000000022dd18>]
cpupool.c#cpupool_move_domain_locked+0x38/0x70
(XEN)    [<000000000022fadc>] cpupool_do_sysctl+0x73c/0x780
(XEN)    [<000000000022d8e0>] do_sysctl+0x788/0xa58
(XEN)    [<0000000000273b68>] traps.c#do_trap_hypercall+0x78/0x170
(XEN)    [<0000000000274f70>] do_trap_guest_sync+0x138/0x618
(XEN)    [<0000000000260458>] entry.o#guest_sync_slowpath+0xa4/0xd4

After some debugging I found out that unit->vcpu_list is NULL after
unit->vcpu_list = d->vcpu[unit->unit_id]; with unit_id 0 in
common/sched/core.c:688
This makes the call to is_idle_unit(unit) in csched2_alloc_udata
trigger the crash.

So there is no vcpu 0 in that domain? How is this possible?

Easy, depending on the order of hypercalls issued by the toolstack.

Between DOMCTL_createdomain and DOMCTL_max_vcpus, the domain exists but
the vcpus haven't been allocated.

Oh yes, indeed.

Bertrand, does the attached patch fix the issue for you?


Juergen
From 82af7d22a69a8cac633a6b2a40bc7d52dac5c5e8 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgr...@suse.com>
To: xen-devel@lists.xenproject.org
Cc: George Dunlap <george.dun...@citrix.com>
Cc: Dario Faggioli <dfaggi...@suse.com>
Date: Mon, 6 Sep 2021 11:19:12 +0200
Subject: [PATCH] xen/sched: fix sched_move_domain() for domain without vcpus

In case a domain is created with a cpupool other than Pool-0 specified
it will be moved to that cpupool before any vcpus are allocated.

This will lead to a NULL pointer dereference in sched_move_domain().

Fix that by tolerating vcpus not being allocated yet.

Fixes: 61649709421a5a7c1 ("xen/domain: Allocate d->vcpu[] in domain_create()")
Reported-by: Bertrand Marquis <bertrand.marq...@arm.com>
Signed-off-by: Juergen Gross <jgr...@suse.com>
---
 xen/common/sched/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 8d178baf3d..79c9100680 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -671,6 +671,10 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
 
     for ( unit_idx = 0; unit_idx < n_units; unit_idx++ )
     {
+        /* Special case for move at domain creation time. */
+        if ( !d->vcpu[unit_idx * gran] )
+            break;
+
         unit = sched_alloc_unit_mem();
         if ( unit )
         {
-- 
2.26.2

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature

Reply via email to