On 14.09.2023 11:04, Julien Grall wrote:
> On 14/09/2023 07:32, Jan Beulich wrote:
>> On 13.09.2023 19:56, Julien Grall wrote:
>>> On 11/09/2023 16:01, Jan Beulich wrote:
>>>> [1] specifies a long list of instructions which are intended to exhibit
>>>> timing behavior independent of the data they operate on. On certain
>>>> hardware this independence is optional, controlled by a bit in a new
>>>> MSR. Provide a command line option to control the mode Xen and its
>>>> guests are to operate in, with a build time control over the default.
>>>> Longer term we may want to allow guests to control this.
>>>
>>> If I read correctly the discussion on the previous version. There was
>>> some concern with this version of patch. I can't find anything public
>>> suggesting that the concerned were dismissed. Do you have any pointer?
>>
>> Well, I can point to the XenServer patch queue, which since then has
>> gained a patch of similar (less flexible) functionality (and seeing
>> the similarity I was puzzled by this patch, which was posted late
>> for 4.17, not having gone in quite some time ago). This to me
>> demonstrates that the original concerns were "downgraded". It would
>> of course still be desirable to have guests control the behavior for
>> themselves, but that's a more intrusive change left for the future.
>>
>> Beyond that I guess I need to have Andrew speak for himself.
>>
>>>> Since Arm64 supposedly also has such a control, put command line option
>>>> and Kconfig control in common files.
>>>>
>>>> [1]
>>>> https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/best-practices/data-operand-independent-timing-isa-guidance.html
>>>>
>>>> Requested-by: Demi Marie Obenour <[email protected]>
>>>> Signed-off-by: Jan Beulich <[email protected]>
>>>> ---
>>>> This may be viewed as a new feature, and hence be too late for 4.18. It
>>>> may, however, also be viewed as security relevant, which is why I'd like
>>>> to propose to at least consider it.
>>>>
>>>> Slightly RFC, in particular for whether the Kconfig option should
>>>> default to Y or N.
>>>
>>> I am not entirely sure. I understand that DIT will help in the
>>> cryptographic case but it is not clear to me what is the performance impact.
>>
>> The entire purpose of the bit is to have a performance impact, slowing
>> down execution when fastpaths could be taken, but shouldn't to achieve
>> the named goal.
>
> I understood that. My question was more related to how much it will
> degrade the performance. This will help to know whether the default
> should be yes or no.
Well, as said - I have no information beyond that to be found at the
provided URL.
>>>> --- a/xen/arch/x86/cpu/common.c
>>>> +++ b/xen/arch/x86/cpu/common.c
>>>> @@ -204,6 +204,28 @@ void ctxt_switch_levelling(const struct
>>>> alternative_vcall(ctxt_switch_masking, next);
>>>> }
>>>>
>>>> +static void setup_doitm(void)
>>>> +{
>>>> + uint64_t msr;
>>>> +
>>>> + if ( !cpu_has_doitm )
>>>
>>> I am not very familiar with the feature. If it is not present, then can
>>> we guarantee that the instructions will be executed in constant time?
>>
>> _We_ cannot guarantee anything. It is only hardware vendors who can. As a
>> result I can only again refer you to the referenced documentation. It
>> claims that without the bit being supported in hardware, an extensive
>> list of insns is going to expose data operand independent performance.
>
> Right. So ...
>
>>
>>> If not, I think we should taint Xen and/or print a message if the admin
>>> requested to use DIT. This would make clear that the admin is trying to
>>> do something that doesn't work.
>>
>> Tainting Xen on all hardware not exposing the bit would seem entirely
>> unreasonable to me. If we learned of specific cases where the vendor
>> promises are broken, we could taint there, sure. I guess that would be
>> individual XSAs / CVEs then for each instance.
>
> ... we need to somehow let the user know that the HW it is running on
> may not honor DIT. Tainting might be a bit too harsh, but I think
> printing a
> message with warning_add() is necessary.
But Intel's claim is that with the bit unavailable hardware behaves as
if DIT was in effect. Therefore you're still suggesting to emit a
warning on most of Intel's hardware and on all non-Intel one. That's
going too far, imo.
Jan