Hi Jan, > -----Original Message----- > From: Xen-devel <[email protected]> On Behalf Of Jan > Beulich > Sent: 2022年9月27日 22:14 > To: [email protected] > Cc: Andrew Cooper <[email protected]>; Wei Liu <[email protected]>; Roger > Pau Monné <[email protected]> > Subject: [PATCH] x86/NUMA: correct off-by-1 in node map size calculation > > extract_lsb_from_nodes() accumulates "memtop" from all PDXes one past > the covered ranges. Hence the maximum address which can validly by used > to index the node map is one below this value, and we may currently set > up a node map with an unused (and never initialized) trailing entry. In > boundary cases this may also mean we dynamically allocate a page when > the static (64-entry) map would suffice. > > While there also correct the comment ahead of the function, for it to > match the actual code: Linux commit 54413927f022 ("x86-64: > x86_64-make-the-numa-hash-function-nodemap-allocation fix fix") removed > the ORing in of the end address before we actually cloned their code. > > Signed-off-by: Jan Beulich <[email protected]> > --- > Really the shift value may end up needlessly small when there's > discontiguous memory. Within a gap, any address can be taken for the > node boundary, and hence neither the end of the lower range nor the > start of the higher range necessarily is the best address to use. For > example with these two node ranges (numbers are frame addresses) > > [10000,17fff] > [28000,2ffff] > > we'd calculate the shift as 15 when 16 or even 17 (because the start of > the 1st range can also be ignored) would do. I haven't tried to properly > prove it yet, but it looks to me as if the top bit of the XOR of lower > range (inclusive) end and higher range start would be what would want > accumulating (of course requiring the entries to be sorted, or to be > processed in address order). This would then "naturally" exclude lowest > range start and highest range end. > > --- a/xen/arch/x86/numa.c > +++ b/xen/arch/x86/numa.c > @@ -110,7 +110,7 @@ static int __init allocate_cachealigned_ > } > > /* > - * The LSB of all start and end addresses in the node map is the value of > the > + * The LSB of all start addresses in the node map is the value of the > * maximum possible shift. > */ > static int __init extract_lsb_from_nodes(const struct node *nodes, > @@ -135,7 +135,7 @@ static int __init extract_lsb_from_nodes > i = BITS_PER_LONG - 1; > else > i = find_first_bit(&bitfield, sizeof(unsigned long)*8); > - memnodemapsize = (memtop >> i) + 1; > + memnodemapsize = ((memtop - 1) >> i) + 1; > return i; > } >
Thanks for this fix. Reviewed-by: Wei Chen <[email protected]>
