On 04.12.23 14:00, George Dunlap wrote:
On Mon, Dec 4, 2023 at 10:57 AM Jan Beulich <[email protected]> wrote:It is only in the error case that we want to clean up the new pool's scheduler data; in the success case it's rather the old scheduler's data which needs cleaning up. Reported-by: René Winther Højgaard <[email protected]> Signed-off-by: Jan Beulich <[email protected]> Reviewed-by: Juergen Gross <[email protected]> --- a/xen/common/sched/core.c +++ b/xen/common/sched/core.c @@ -810,7 +810,7 @@ int sched_move_domain(struct domain *d, for ( unit = old_units; unit; ) { if ( unit->priv ) - sched_free_udata(c->sched, unit->priv); + sched_free_udata(ret ? c->sched : old_ops, unit->priv); old_unit = unit; unit = unit->next_in_list; xfree(old_unit);This code is unfortunately written in a "clever" way which seems to have introduced some confusion. The one place which calls "goto out_free" goes through and replaces *most* of the "old_*" variables with the "new" equivalents. That's why we're iterating over `old_units` even on the failure path. The result is that this change doesn't catch another bug on the following line, in the error case: sched_free_domdata(old_ops, old_domdata); At this point, old_ops is still the old ops, but old_domdata is the *new* domdata. A patch like the following (compile tested only) would fix it along the lines of the original intent: 8<------- diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c index eba0cea4bb..78f21839d3 100644 --- a/xen/common/sched/core.c +++ b/xen/common/sched/core.c @@ -720,6 +720,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c) { old_units = new_units; old_domdata = domdata; + old_ops = c->sched; ret = -ENOMEM; goto out_free; } @@ -809,10 +810,15 @@ int sched_move_domain(struct domain *d, struct cpupool *c) domain_unpause(d); out_free: + /* + * NB if we've jumped here, "old_units", "old_ops", and so on will + * actually be pointing to the new ops, since when aborting it's + * the new ops we want to free. + */ for ( unit = old_units; unit; ) { if ( unit->priv ) - sched_free_udata(c->sched, unit->priv); + sched_free_udata(old_ops, unit->priv); old_unit = unit; unit = unit->next_in_list; xfree(old_unit); ---->8 But given that this kind of cleverness has already fooled two of our most senior developers, I'd suggest making the whole thing more explicit; something like the attached (again compile-tested only)?
And I have again a third approach, making it crystal clear what is happening with which data. No need to explain what is freed via which variables. See attached patch (this time it should be really there). Thoughts? Juergen
OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key
From b8313d7060e9499fa7580265bda703cce05fe46a Mon Sep 17 00:00:00 2001 From: Juergen Gross <[email protected]> Date: Mon, 4 Dec 2023 14:15:45 +0100 Subject: [PATCH] xen/sched: fix sched_move_domain() MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Do cleanup in sched_move_domain() in a dedicated service function, which is called either in error case with newly allocated data, or in success case with the old data to be freed. This will at once fix some subtle bugs which sneaked in due to forgetting to overwrite some pointers in the error case. Fixes: 70fadc41635b ("xen/cpupool: support moving domain between cpupools with different granularity") Reported-by: René Winther Højgaard <[email protected]> Initial-fix-by: Jan Beulich <[email protected]> Initial-fix-by: George Dunlap <[email protected]> Signed-off-by: Juergen Gross <[email protected]> --- xen/common/sched/core.c | 47 +++++++++++++++++++++++------------------ 1 file changed, 27 insertions(+), 20 deletions(-) diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c index eba0cea4bb..fd92e9b13a 100644 --- a/xen/common/sched/core.c +++ b/xen/common/sched/core.c @@ -647,6 +647,24 @@ static void sched_move_irqs(const struct sched_unit *unit) vcpu_move_irqs(v); } +static void sched_move_domain_cleanup(struct scheduler *ops, + struct sched_unit *units, + void *domdata) +{ + struct sched_unit *unit, *old_unit; + + for ( unit = units; unit; ) + { + if ( unit->priv ) + sched_free_udata(ops, unit->priv); + old_unit = unit; + unit = unit->next_in_list; + xfree(old_unit); + } + + sched_free_domdata(ops, domdata); +} + /* * Move a domain from one cpupool to another. * @@ -686,7 +704,6 @@ int sched_move_domain(struct domain *d, struct cpupool *c) void *old_domdata; unsigned int gran = cpupool_get_granularity(c); unsigned int n_units = d->vcpu[0] ? DIV_ROUND_UP(d->max_vcpus, gran) : 0; - int ret = 0; for_each_vcpu ( d, v ) { @@ -699,8 +716,9 @@ int sched_move_domain(struct domain *d, struct cpupool *c) domdata = sched_alloc_domdata(c->sched, d); if ( IS_ERR(domdata) ) { - ret = PTR_ERR(domdata); - goto out; + rcu_read_unlock(&sched_res_rculock); + + return PTR_ERR(domdata); } for ( unit_idx = 0; unit_idx < n_units; unit_idx++ ) @@ -718,10 +736,10 @@ int sched_move_domain(struct domain *d, struct cpupool *c) if ( !unit || !unit->priv ) { - old_units = new_units; - old_domdata = domdata; - ret = -ENOMEM; - goto out_free; + sched_move_domain_cleanup(c->sched, new_units, domdata); + rcu_read_unlock(&sched_res_rculock); + + return -ENOMEM; } unit_ptr = &unit->next_in_list; @@ -808,22 +826,11 @@ int sched_move_domain(struct domain *d, struct cpupool *c) domain_unpause(d); - out_free: - for ( unit = old_units; unit; ) - { - if ( unit->priv ) - sched_free_udata(c->sched, unit->priv); - old_unit = unit; - unit = unit->next_in_list; - xfree(old_unit); - } - - sched_free_domdata(old_ops, old_domdata); + sched_move_domain_cleanup(old_ops, old_units, old_domdata); - out: rcu_read_unlock(&sched_res_rculock); - return ret; + return 0; } void sched_destroy_vcpu(struct vcpu *v) -- 2.35.3
OpenPGP_signature.asc
Description: OpenPGP digital signature
