On 04.12.23 15:39, Jan Beulich wrote:
On 04.12.2023 15:18, George Dunlap wrote:
On Mon, Dec 4, 2023 at 2:10 PM Juergen Gross <[email protected]> wrote:

On 04.12.23 14:00, George Dunlap wrote:
On Mon, Dec 4, 2023 at 10:57 AM Jan Beulich <[email protected]> wrote:

It is only in the error case that we want to clean up the new pool's
scheduler data; in the success case it's rather the old scheduler's
data which needs cleaning up.

Reported-by: René Winther Højgaard <[email protected]>
Signed-off-by: Jan Beulich <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>

--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -810,7 +810,7 @@ int sched_move_domain(struct domain *d,
       for ( unit = old_units; unit; )
       {
           if ( unit->priv )
-            sched_free_udata(c->sched, unit->priv);
+            sched_free_udata(ret ? c->sched : old_ops, unit->priv);
           old_unit = unit;
           unit = unit->next_in_list;
           xfree(old_unit);

This code is unfortunately written in a "clever" way which seems to
have introduced some confusion.  The one place which calls "goto
out_free" goes through and replaces *most* of the "old_*" variables
with the "new" equivalents.  That's why we're iterating over
`old_units` even on the failure path.

The result is that this change doesn't catch another bug on the
following line, in the error case:

sched_free_domdata(old_ops, old_domdata);

At this point, old_ops is still the old ops, but old_domdata is the
*new* domdata.

A patch like the following (compile tested only) would fix it along
the lines of the original intent:
8<-------
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index eba0cea4bb..78f21839d3 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -720,6 +720,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
           {
               old_units = new_units;
               old_domdata = domdata;
+            old_ops = c->sched;
               ret = -ENOMEM;
               goto out_free;
           }
@@ -809,10 +810,15 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
       domain_unpause(d);

    out_free:
+    /*
+     * NB if we've jumped here, "old_units", "old_ops", and so on will
+     * actually be pointing to the new ops, since when aborting it's
+     * the new ops we want to free.
+     */
       for ( unit = old_units; unit; )
       {
           if ( unit->priv )
-            sched_free_udata(c->sched, unit->priv);
+            sched_free_udata(old_ops, unit->priv);
           old_unit = unit;
           unit = unit->next_in_list;
           xfree(old_unit);
---->8

But given that this kind of cleverness has already fooled two of our
most senior developers, I'd suggest making the whole thing more
explicit; something like the attached (again compile-tested only)?

And I have again a third approach, making it crystal clear what is happening
with which data. No need to explain what is freed via which variables. See
attached patch (this time it should be really there).

Yes, I thought about making a function as well -- that works for me too.

Personally I prefer to keep the "goto out", rather than duplicating
the rcu_read_unlock().  I'd yield if Jan said he preferred
duplication, however.

I'm on the edge there actually.

In this case I'd prefer it my way, as it avoids having to scroll down to the
out: label to see what is happening there. Additionally it enables to get rid
of the ret variable.

In the end the main part of the patch is the new function, so I'm not really
feeling strong regarding the dropping of "goto out".


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature.asc
Description: OpenPGP digital signature

Reply via email to