On 07/05/2024 3:45 pm, Roger Pau Monné wrote:
> On Tue, May 07, 2024 at 03:31:19PM +0100, Andrew Cooper wrote:
>> On 07/05/2024 3:24 pm, Roger Pau Monné wrote:
>>> On Tue, May 07, 2024 at 02:45:40PM +0100, Andrew Cooper wrote:
>>>> Ever since Xen 4.14, there has been a latent bug with migration.
>>>>
>>>> While some toolstacks can level the features properly, they don't shrink
>>>> feat.max_subleaf when all features have been dropped.  This is because
>>>> we *still* have not completed the toolstack side work for full CPU Policy
>>>> objects.
>>>>
>>>> As a consequence, even when properly feature levelled, VMs can't migrate
>>>> "backwards" across hardware which reduces feat.max_subleaf.  One such
>>>> example is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake
>>>> (max_subleaf=0).
>>>>
>>>> Extend the max policies feat.max_subleaf to the highest number Xen knows
>>>> about, but leave the default policies matching the host.  This will allow
>>>> VMs with a higher feat.max_subleaf than strictly necessary to migrate in.
>>>>
>>>> Eventually we'll manage to teach the toolstack how to avoid creating such
>>>> VMs in the first place, but there's still more work to do there.
>>>>
>>>> Signed-off-by: Andrew Cooper <[email protected]>
>>> Acked-by: Roger Pau Monné <[email protected]>
>> Thanks.
>>
>>> Even if we have just found one glitch with PSFD and Ice Lake vs
>>> Cascade Lake, wouldn't it be safer to always extend the max policies
>>> max leafs and subleafs to match the known array sizes?
>> This is the final max leaf (containing feature information) to gain
>> custom handling, I think?
> Couldn't the same happen with extended leaves?  Some of the extended
> leaves contain features, and hence for policy leveling toolstack might
> decide to zero them, yet extd.max_leaf won't be adjusted.

Hmm.  Right now, extd max leaf is also the one with the bit that we
unconditionally advertise, and it's inherited all the way from the host
policy.

So yes, in principle, but anything that bumps this limit is going to
have other implications too, and I'd prefer not to second-guess them at
this point.

I hope we can get the toolstack side fixes before this becomes a real
problem...

~Andrew
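
For anyone following the thread, here is a toy model of the failure mode and of what the patch changes. It is a sketch only: struct toy_policy, toy_compatible(), toy_extend_max() and TOY_KNOWN_FEAT_SUBLEAVES are hypothetical names invented for illustration, not Xen's actual CPU policy code, and the feature words are collapsed to one per subleaf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: number of leaf 7 subleaves this toy model knows (0..2). */
#define TOY_KNOWN_FEAT_SUBLEAVES 3

/* Hypothetical, heavily simplified stand-in for a CPU policy object. */
struct toy_policy {
    uint32_t feat_max_subleaf;                /* models feat.max_subleaf */
    uint32_t feat[TOY_KNOWN_FEAT_SUBLEAVES];  /* one feature word per subleaf */
};

/*
 * A guest policy can be accepted only if it does not claim a higher
 * max_subleaf than the max policy, and sets no feature bit the max
 * policy lacks.
 */
static bool toy_compatible(const struct toy_policy *max,
                           const struct toy_policy *guest)
{
    assert(max && guest);

    if (guest->feat_max_subleaf > max->feat_max_subleaf)
        return false;

    for (unsigned int i = 0; i < TOY_KNOWN_FEAT_SUBLEAVES; i++)
        if (guest->feat[i] & ~max->feat[i])
            return false;

    return true;
}

/*
 * Models the change described in the patch: the max policy advertises the
 * highest subleaf known, regardless of what the host reports.
 */
static void toy_extend_max(struct toy_policy *max)
{
    assert(max);
    max->feat_max_subleaf = TOY_KNOWN_FEAT_SUBLEAVES - 1;
}
```

In this model, a guest levelled on the source host with all subleaf 1 and 2 features cleared, but max_subleaf still 2, is rejected by a destination whose max policy has max_subleaf=0, and is accepted once toy_extend_max() has been applied to that max policy.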
