On 06.09.21 10:46, Andrew Cooper wrote:
> On 06/09/2021 09:23, Juergen Gross wrote:
>> On 03.09.21 17:41, Bertrand Marquis wrote:
>>> Hi,
>>>
>>> While doing some investigation with cpupools I encountered a crash
>>> when trying to isolate a guest to its own physical cpu.
>>>
>>> I am using current staging status.
>>>
>>> I did the following (on FVP with 8 cores):
>>> - start dom0 with dom0_max_vcpus=1
>>> - remove core 1 from dom0 cpupool: xl cpupool-cpu-remove Pool-0 1
>>> - create a new pool: xl cpupool-create name=\"NetPool\"
>>> - add core 1 to the pool: xl cpupool-cpu-add NetPool 1
>>> - create a guest in NetPool using the following in the guest config:
>>>   pool="NetPool"
>>>
>>> I end with a crash with the following call trace during guest creation:
>>> (XEN) Xen call trace:
>>> (XEN)    [<0000000000234cb0>] credit2.c#csched2_alloc_udata+0x58/0xfc (PC)
>>> (XEN)    [<0000000000234c80>] credit2.c#csched2_alloc_udata+0x28/0xfc (LR)
>>> (XEN)    [<0000000000242d38>] sched_move_domain+0x144/0x6c0
>>> (XEN)    [<000000000022dd18>] cpupool.c#cpupool_move_domain_locked+0x38/0x70
>>> (XEN)    [<000000000022fadc>] cpupool_do_sysctl+0x73c/0x780
>>> (XEN)    [<000000000022d8e0>] do_sysctl+0x788/0xa58
>>> (XEN)    [<0000000000273b68>] traps.c#do_trap_hypercall+0x78/0x170
>>> (XEN)    [<0000000000274f70>] do_trap_guest_sync+0x138/0x618
>>> (XEN)    [<0000000000260458>] entry.o#guest_sync_slowpath+0xa4/0xd4
>>>
>>> After some debugging I found out that unit->vcpu_list is NULL after
>>> unit->vcpu_list = d->vcpu[unit->unit_id]; with unit_id 0 in
>>> common/sched/core.c:688
>>> This makes the call to is_idle_unit(unit) in csched2_alloc_udata
>>> trigger the crash.
>>
>> So there is no vcpu 0 in that domain? How is this possible?
>
> Easy, depending on the order of hypercalls issued by the toolstack.
>
> Between DOMCTL_createdomain and DOMCTL_max_vcpus, the domain exists but
> the vcpus haven't been allocated.

Oh yes, indeed.

Bertrand, does the attached patch fix the issue for you?


Juergen

From 82af7d22a69a8cac633a6b2a40bc7d52dac5c5e8 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgr...@suse.com>
To: xen-devel@lists.xenproject.org
Cc: George Dunlap <george.dun...@citrix.com>
Cc: Dario Faggioli <dfaggi...@suse.com>
Date: Mon, 6 Sep 2021 11:19:12 +0200
Subject: [PATCH] xen/sched: fix sched_move_domain() for domain without vcpus

In case a domain is created with a cpupool other than Pool-0 specified
it will be moved to that cpupool before any vcpus are allocated.

This will lead to a NULL pointer dereference in sched_move_domain().

Fix that by tolerating vcpus not being allocated yet.

Fixes: 61649709421a5a7c1 ("xen/domain: Allocate d->vcpu[] in domain_create()")
Reported-by: Bertrand Marquis <bertrand.marq...@arm.com>
Signed-off-by: Juergen Gross <jgr...@suse.com>
---
 xen/common/sched/core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 8d178baf3d..79c9100680 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -671,6 +671,10 @@ int sched_move_domain(struct domain *d, struct cpupool *c)

     for ( unit_idx = 0; unit_idx < n_units; unit_idx++ )
     {
+        /* Special case for move at domain creation time. */
+        if ( !d->vcpu[unit_idx * gran] )
+            break;
+
         unit = sched_alloc_unit_mem();
         if ( unit )
         {
-- 
2.26.2
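
For anyone wanting to see the failure mode outside the hypervisor, here is a minimal standalone sketch of the situation the patch guards against. It is not Xen code: struct toy_domain, struct toy_vcpu and the toy_* helpers are hypothetical stand-ins, and the way n_units is derived from max_vcpus and gran is an assumption made for illustration. It only models a domain whose vcpu[] array exists while its entries are still NULL, and a move loop that stops at the first missing vcpu instead of dereferencing it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for Xen's struct vcpu and struct domain. */
struct toy_vcpu {
    unsigned int id;
};

struct toy_domain {
    unsigned int max_vcpus;   /* number of slots in vcpu[] */
    struct toy_vcpu **vcpu;   /* array is allocated, entries may be NULL */
};

/* Models the state right after domain creation: array present, no vcpus. */
static struct toy_domain *toy_domain_create(unsigned int max_vcpus)
{
    struct toy_domain *d = calloc(1, sizeof(*d));

    if ( !d )
        return NULL;

    d->max_vcpus = max_vcpus;
    d->vcpu = calloc(max_vcpus, sizeof(*d->vcpu));
    if ( !d->vcpu )
    {
        free(d);
        return NULL;
    }

    return d;
}

/* Models the later step in which the first n vcpus get allocated. */
static int toy_alloc_vcpus(struct toy_domain *d, unsigned int n)
{
    unsigned int i;

    assert(n <= d->max_vcpus);

    for ( i = 0; i < n; i++ )
    {
        if ( d->vcpu[i] )
            continue;
        d->vcpu[i] = calloc(1, sizeof(*d->vcpu[i]));
        if ( !d->vcpu[i] )
            return -1;
        d->vcpu[i]->id = i;
    }

    return 0;
}

/*
 * Models the guarded loop: returns how many units would be set up.
 * Without the NULL check, the dereference below would crash for a
 * domain that has no vcpus yet.
 */
static unsigned int toy_move_domain(const struct toy_domain *d,
                                    unsigned int gran)
{
    unsigned int n_units, unit_idx, moved = 0;

    assert(gran > 0);

    /* Assumption: one unit per gran vcpus, rounded up. */
    n_units = (d->max_vcpus + gran - 1) / gran;

    for ( unit_idx = 0; unit_idx < n_units; unit_idx++ )
    {
        /* Special case for move at domain creation time. */
        if ( !d->vcpu[unit_idx * gran] )
            break;

        /* A real scheduler would use the vcpu here; we just touch it. */
        (void)d->vcpu[unit_idx * gran]->id;
        moved++;
    }

    return moved;
}
```

Calling toy_move_domain() on a freshly created toy domain sets up no units and returns cleanly, while the same call after toy_alloc_vcpus() walks every unit. That mirrors the two orderings of "move to cpupool" and "allocate vcpus" discussed above.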