On 21.08.2025 18:20, Andrew Cooper wrote:
> On 19/08/2025 5:23 pm, Jan Beulich wrote:
>> On 19.08.2025 15:52, Andrew Cooper wrote:
>>> On 18/08/2025 12:27 pm, Jan Beulich wrote:
>>>> On 15.08.2025 22:41, Andrew Cooper wrote:
>>>>> ... on capable toolchains.
>>>>>
>>>>> This avoids needing to hold rc in a register across the RDMSR, and in most
>>>>> cases removes direct testing and branching based on rc, as the fault label
>>>>> can be rearranged to directly land on the out-of-line block.
>>>>>
>>>>> There is a subtle difference in behaviour.  The old behaviour would, on
>>>>> fault, still produce 0's and write to val.
>>>>>
>>>>> The new behaviour only writes val on success, and write_msr() is the only
>>>>> place where this matters.  Move temp out of switch() scope and initialise
>>>>> it to 0.
>>>> But what's the motivation behind making this behavioral change? At least in
>>>> the cases where the return value isn't checked, it would feel safer if we
>>>> continued clearing the value. Even if in all cases where this could matter
>>>> (besides the one you cover here) one can prove correctness by looking at
>>>> surrounding code.
>>> I didn't realise I'd made a change at first, but it's a consequence of
>>> the compiler's ability to rearrange basic blocks.
>>>
>>> It can be fixed with ...
>>>
>>>>> --- a/xen/arch/x86/include/asm/msr.h
>>>>> +++ b/xen/arch/x86/include/asm/msr.h
>>>>> @@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
>>>>>  /* rdmsr with exception handling */
>>>>>  static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
>>>>>  {
>>>>> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
>>>>> +    uint64_t lo, hi;
>>>>> +    asm_inline goto (
>>>>> +        "1: rdmsr\n\t"
>>>>> +        _ASM_EXTABLE(1b, %l[fault])
>>>>> +        : "=a" (lo), "=d" (hi)
>>>>> +        : "c" (msr)
>>>>> +        :
>>>>> +        : fault );
>>>>> +
>>>>> +    *val = lo | (hi << 32);
>>>>> +
>>>>> +    return 0;
>>>>> +
>>>>> + fault:
>>>     *val = 0;
>>>
>>> here, but I don't want to do this.  Because val is by pointer and
>>> generally spilled to the stack, the compiler can't optimise away the store.
>> But the compiler is dealing with such indirection in inline functions just
>> fine. I don't expect it would typically spill val to the stack. Is there
>> anything specific here that you think would make this more likely?
> 
> Yes.  The design of the functions they're used in.  Adding this line
> results in:
> 
> add/remove: 0/0 grow/shrink: 7/2 up/down: 109/-36 (73)
> Function                                     old     new   delta
> read_msr                                    1243    1307     +64
> resource_access                              326     341     +15
> hwp_init_msrs.cold                           297     308     +11
> probe_cpuid_faulting                         168     175      +7
> svm_msr_read_intercept                      1034    1039      +5
> hwp_write_request                            113     117      +4
> hwp_init_msrs                                371     374      +3
> amd_log_freq                                 844     828     -16
> guest_rdmsr                                 2168    2148     -20
> 
> Taking read_msr() as a concrete example, this is because it's a store
> into a parent function's variable, not into a local variable, and cannot
> be elided.
> 
> 
>>
>>> I'd far rather get a real compiler error, than to have logic relying on
>>> the result of a faulting MSR read.
>> A compiler error? (Hmm, perhaps you think of uninitialized variable
>> diagnostics. That may or may not trigger, depending on how else the
>> caller's variable is used.)
> 
> Yes, I was referring to the uninitialised variable diagnostic.  *_safe()
> are fairly rare, and we've got plenty of coverage in CI.

Well, okay, slightly hesitantly
Reviewed-by: Jan Beulich <jbeul...@suse.com>
preferably with the paragraph in the description that I commented on
slightly expanded to cover the "why" aspect.

Jan
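
For readers following the behavioural change discussed above, here is a
minimal userspace sketch of the two variants.  It is not the Xen code:
fake_rdmsr(), FAKE_MSR_OK, FAKE_EFAULT, rdmsr_safe_old(), rdmsr_safe_new()
and read_ignoring_rc() are all hypothetical names, and the fault is simulated
with a plain return value rather than an exception table entry or asm goto.
It only illustrates that the old form stores 0 through val on a fault, while
the new form leaves the caller's variable untouched, which is why a caller
that ignores the return value would need to initialise its variable itself.

```c
#include <assert.h>
#include <stdint.h>

#define FAKE_MSR_OK  0x10u   /* hypothetical index that "reads" successfully */
#define FAKE_EFAULT  (-14)   /* hypothetical error code for a faulting read */

/* Hypothetical stand-in for RDMSR: returns nonzero to simulate a fault. */
static int fake_rdmsr(uint32_t msr, uint32_t *lo, uint32_t *hi)
{
    if (msr != FAKE_MSR_OK)
        return 1;            /* simulated fault, outputs left untouched */

    *lo = 0x89abcdefu;
    *hi = 0x01234567u;
    return 0;
}

/* Old behaviour: *val is written on both paths, with 0 on fault. */
static int rdmsr_safe_old(uint32_t msr, uint64_t *val)
{
    uint32_t lo = 0, hi = 0;
    int rc = fake_rdmsr(msr, &lo, &hi) ? FAKE_EFAULT : 0;

    *val = lo | ((uint64_t)hi << 32);
    return rc;
}

/* New behaviour: *val is written only on success. */
static int rdmsr_safe_new(uint32_t msr, uint64_t *val)
{
    uint32_t lo, hi;

    if (fake_rdmsr(msr, &lo, &hi))
        return FAKE_EFAULT;  /* caller's variable is not modified */

    *val = lo | ((uint64_t)hi << 32);
    return 0;
}

/*
 * A caller that ignores the return value.  With the new behaviour it has to
 * initialise temp itself, otherwise temp would be read uninitialised when
 * the read faults.
 */
static uint64_t read_ignoring_rc(uint32_t msr)
{
    uint64_t temp = 0;

    (void)rdmsr_safe_new(msr, &temp);
    return temp;
}
```

In this sketch a caller that passes in a non-zero variable sees it zeroed by
rdmsr_safe_old() on a fault and left alone by rdmsr_safe_new().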
