On 07.05.2024 15:45, Andrew Cooper wrote:
> Ever since Xen 4.14, there has been a latent bug with migration.
>
> While some toolstacks can level the features properly, they don't shrink
> feat.max_subleaf when all features have been dropped. This is because
> we *still* have not completed the toolstack side work for full CPU Policy
> objects.
>
> As a consequence, even when properly feature levelled, VMs can't migrate
> "backwards" across hardware which reduces feat.max_subleaf. One such example
> is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
>
> Extend the max policies' feat.max_subleaf to the highest number Xen knows
> about, but leave the default policies matching the host. This will allow VMs
> with a higher feat.max_subleaf than strictly necessary to migrate in.
>
> Eventually we'll manage to teach the toolstack how to avoid creating such VMs
> in the first place, but there's still more work to do there.

Can you explain to what extent "x86/CPUID: shrink max_{,sub}leaf fields
according to actual leaf contents" would not already have taken care of
this (and not just for sub-leaves of leaf 7), if only it (or at least its
more recent versions) had ever been seriously looked at? I realize there
was one todo item left there (which I could probably have used some help
addressing), but that shouldn't have entirely prevented any progress.
(If I'm not mistaken, an earlier version had once gone in, but then needed
to be reverted.)
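
To illustrate what shrinking according to actual leaf contents means for
leaf 7, here is a minimal standalone sketch. It is not the code from that
patch or from Xen: struct feat_policy, struct leaf, FEAT_MAX_SUBLEAVES and
shrink_feat_max_subleaf() are hypothetical names, and max_subleaf is kept
as a separate field purely to keep the example simple.

```c
#include <stdint.h>

/* Hypothetical: model sub-leaves 0..2 of CPUID leaf 7 only. */
#define FEAT_MAX_SUBLEAVES 3

/* Register values reported for one sub-leaf. */
struct leaf {
    uint32_t a, b, c, d;
};

/* Hypothetical, simplified stand-in for the "feat" part of a policy. */
struct feat_policy {
    uint32_t max_subleaf;
    struct leaf subleaf[FEAT_MAX_SUBLEAVES];
};

/* A sub-leaf carries no information if all four registers are zero. */
static int leaf_is_empty(const struct leaf *l)
{
    return !(l->a | l->b | l->c | l->d);
}

/*
 * Set max_subleaf to the highest sub-leaf which still has any bit set,
 * falling back to 0 when everything above sub-leaf 0 has been levelled
 * away.
 */
static void shrink_feat_max_subleaf(struct feat_policy *p)
{
    uint32_t i = FEAT_MAX_SUBLEAVES - 1;

    while ( i > 0 && leaf_is_empty(&p->subleaf[i]) )
        --i;

    p->max_subleaf = i;
}
```

With something along these lines, a guest levelled down to Cascade Lake's
feature set would end up with max_subleaf=0 even when created on Ice Lake,
because sub-leaves 1 and 2 would be empty.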
Jan