On 13/12/2022 11:36 am, Jan Beulich wrote:
> With phys_to_nid() now actively checking that a valid node ID is on
> record, the two uses in paging_init() can actually trigger at least the
> 2nd of the assertions there. They're used to calculate allocation flags,
> but the calculated flags wouldn't be used when dealing with an invalid
> (unpopulated) address range. Defer the calculations such that they can
> be done with a validated MFN in hand. This also does away with the
> artificial calculations of an address to pass to phys_to_nid().
>
> Note that while the variable is provably written before use, at least
> some compiler versions can't actually verify that. Hence the variable
> also needs to gain a (dead) initializer.

I'm not surprised in the slightest that GCC can't prove that it is
always initialised.  I suspect a lot of humans would struggle too.
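
For readers following along, here is a minimal standalone sketch of the
pattern the commit message describes. It is not the actual paging_init()
code: frame_node[], frame_valid(), frame_to_node() and node_for_range()
are hypothetical names made up for illustration. The strict lookup is
only called once a frame has been validated, and the result variable
carries a dead initializer because a compiler may not be able to prove
that it is written before it is read.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE   0xffu
#define NR_FRAMES 8u

/* Hypothetical frame-to-node table; NO_NODE marks unpopulated frames. */
static const unsigned int frame_node[NR_FRAMES] = {
    NO_NODE, NO_NODE, 0, 0, NO_NODE, 1, 1, NO_NODE
};

/* Cheap validity check, standing in for a check such as mfn_valid(). */
static bool frame_valid(unsigned int frame)
{
    return frame < NR_FRAMES && frame_node[frame] != NO_NODE;
}

/* Strict lookup: asserts that a valid node ID is on record. */
static unsigned int frame_to_node(unsigned int frame)
{
    assert(frame < NR_FRAMES);
    assert(frame_node[frame] != NO_NODE);
    return frame_node[frame];
}

/*
 * Return the node of the first valid frame in [start, start + count),
 * or -1 if the whole range is a hole.  The lookup is deferred until a
 * validated frame is in hand, so the assertions cannot fire on holes.
 */
int node_for_range(unsigned int start, unsigned int count)
{
    unsigned int node = 0; /* dead initializer, only to placate the compiler */
    unsigned int i, holes = 0;

    for (i = 0; i < count; i++) {
        if (!frame_valid(start + i))
            holes++;
        else if (i == holes) /* every earlier frame was a hole */
            node = frame_to_node(start + i);
    }

    if (holes == count)
        return -1; /* nothing populated: node was never computed */

    return (int)node;
}
```

The "i == holes" test is what makes the write-before-use argument hold:
node is assigned at the first valid frame, and it is only read when at
least one valid frame exists. That reasoning spans the loop and the
later check, which is the sort of thing flow analysis tends to miss.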

> Fixes: e9c72d524fbd ("xen/x86: Use ASSERT instead of VIRTUAL_BUG_ON for phys_to_nid")
> Signed-off-by: Jan Beulich <[email protected]>

This does appear to fix things.  (Testing hasn't finished yet, but all
systems have installed, and they didn't get that far previously.)

> ---
> RFC: With small enough a NUMA hash shift it would still be possible to
>      hit an SRAT hole, despite mfn_valid() passing. Hence, as was the
>      original plan, it may still be necessary to relax the checking in
>      phys_to_nid() (or its designated replacements). At which point the
>      value of this change here would shrink to merely reducing the
>      chance of unintentionally doing NUMA_NO_NODE allocations.

Why does the NUMA shift matter?  Can't this occur for badly constructed
SRAT tables too?


Nevertheless, this is a clear improvement over what's currently in tree,
so I'm going to commit it to try and unblock OSSTest.  The tree has been
blocked for too long.  Further adjustments can come in due course.

~Andrew
