On 14/03/2019 20:25, Thomas Gleixner wrote:
> Ashok,
>
> On Thu, 14 Mar 2019, Raj, Ashok wrote:
>> On Thu, Mar 14, 2019 at 12:39:46PM +0000, Andrew Cooper wrote:
>>> On late load failure, we should dump enough information to work out
>>> exactly what went on, to determine how best to proceed, but the server
>>> is effectively lost to us.  On late load success, the proposed new
>>> "version" replaces the current "version".
>>>
>>> And again - I reiterate the point that I think it is fine to have a
>>> simplifying assumption that we don't have mixed stepping systems to
>>> start with, presuming this is generally in line with Intel's support
>>> statement.  If in practice we find mixed stepping systems which are
>>> supported by an OEM/Intel, we can see about extending the logic.
>> Checking with Asit, he says it is in fact permitted to be one stepping
>> behind even on a multi-socket system. One socket could be N and the
>> other N-1, and that should be supported.
> That turns into a total disaster if N has an issue fixed and N-1 requires
> microcode + software workaround.
>
> So if N is on the boot socket, then we fail to enable the workaround
> because CPU0 has the 'Issue fixed' bit set.
>
> If N-1 is on the boot socket, then we go to do the workaround nevertheless
> on N and that might, depending on the issue, just be some pointless exercise
> or even try to access some MSR which is not available.
>
> *Shudder*

Intel: Are you saying that Skylake (06-55-04) is supported in
combination with Cascade Lake B0 (06-55-05) and/or Cascade Lake B1
(06-55-06)?

The most insidious problem is TSX_FORCE_ABORT between the two Cascade
Lakes.  There really will be an asymmetric existence of an MSR required
for use in one part of the system, and unavailable in the other part of
the system.
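
To make the hazard concrete, here is a minimal sketch in C.  It is not
Xen or Linux code: struct cpu_info and all three helper functions are
hypothetical names invented for illustration, and which stepping carries
the MSR is left to the caller rather than asserted here.  It only
contrasts a decision taken from the boot CPU's enumeration with one taken
from each CPU's own enumeration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical per-CPU record.  A real kernel or hypervisor would fill
 * this in from CPUID executed on each CPU; here the caller supplies it.
 */
struct cpu_info {
    unsigned int stepping;      /* e.g. 5 or 6 for 06-55-05 / 06-55-06 */
    bool has_tsx_force_abort;   /* does this CPU enumerate the MSR? */
};

/* Fragile: every CPU acts on what the boot CPU (index 0) enumerates. */
static bool use_msr_boot_view(const struct cpu_info *cpus, size_t self)
{
    (void)self;                 /* the local CPU is never consulted */
    return cpus[0].has_tsx_force_abort;
}

/* Safer: each CPU consults only its own enumeration. */
static bool use_msr_local_view(const struct cpu_info *cpus, size_t self)
{
    return cpus[self].has_tsx_force_abort;
}

/*
 * Count the CPUs on which the boot-CPU view gives the wrong answer:
 * either an access to an MSR that is absent locally, or a skipped
 * access to an MSR that is present locally.
 */
static size_t count_mismatches(const struct cpu_info *cpus, size_t n)
{
    size_t bad = 0;

    assert(cpus != NULL && n > 0);
    for (size_t i = 0; i < n; i++)
        if (use_msr_boot_view(cpus, i) != use_msr_local_view(cpus, i))
            bad++;
    return bad;
}
```

On a uniform system count_mismatches() returns 0.  On a two-socket system
where only one socket enumerates the MSR it returns 1 whichever socket
boots, which is the asymmetry described above.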

To a certain degree, what is technically supported by Intel is also
tempered by what the major OS/VMM vendors are willing to boot on, as
that is ultimately what the customer is paying for.  When steppings
differ only by the errata fixed, and the silicon is otherwise
identical from software's point of view, supporting a range of adjacent
steppings seems entirely reasonable.

In this case you've got 3 adjacent steppings, *all* of which offer
different architecturally defined features, and will involve software
changes to allow mixed systems to function in a safe way.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
