On 21.08.2025 18:20, Andrew Cooper wrote:
> On 19/08/2025 5:23 pm, Jan Beulich wrote:
>> On 19.08.2025 15:52, Andrew Cooper wrote:
>>> On 18/08/2025 12:27 pm, Jan Beulich wrote:
>>>> On 15.08.2025 22:41, Andrew Cooper wrote:
>>>>> ... on capable toolchains.
>>>>>
>>>>> This avoids needing to hold rc in a register across the RDMSR, and in most
>>>>> cases removes direct testing and branching based on rc, as the fault label
>>>>> can be rearranged to directly land on the out-of-line block.
>>>>>
>>>>> There is a subtle difference in behaviour.  The old behaviour would, on
>>>>> fault, still produce 0's and write to val.
>>>>>
>>>>> The new behaviour only writes val on success, and write_msr() is the only
>>>>> place where this matters.  Move temp out of switch() scope and initialise
>>>>> it to 0.
>>>> But what's the motivation behind making this behavioral change? At least in
>>>> the cases where the return value isn't checked, it would feel safer if we
>>>> continued clearing the value. Even if in all cases where this could matter
>>>> (besides the one you cover here) one can prove correctness by looking at
>>>> surrounding code.
>>> I didn't realise I'd made a change at first, but it's a consequence of
>>> the compiler's ability to rearrange basic blocks.
>>>
>>> It can be fixed with ...
>>>
>>>>> --- a/xen/arch/x86/include/asm/msr.h
>>>>> +++ b/xen/arch/x86/include/asm/msr.h
>>>>> @@ -55,6 +55,24 @@ static inline void wrmsrns(uint32_t msr, uint64_t val)
>>>>>  /* rdmsr with exception handling */
>>>>>  static inline int rdmsr_safe(unsigned int msr, uint64_t *val)
>>>>>  {
>>>>> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
>>>>> +    uint64_t lo, hi;
>>>>> +    asm_inline goto (
>>>>> +        "1: rdmsr\n\t"
>>>>> +        _ASM_EXTABLE(1b, %l[fault])
>>>>> +        : "=a" (lo), "=d" (hi)
>>>>> +        : "c" (msr)
>>>>> +        :
>>>>> +        : fault );
>>>>> +
>>>>> +    *val = lo | (hi << 32);
>>>>> +
>>>>> +    return 0;
>>>>> +
>>>>> + fault:
>>>     *val = 0;
>>>
>>> here, but I don't want to do this.  Because val is by pointer and
>>> generally spilled to the stack, the compiler can't optimise away the store.
>> But the compiler is dealing with such indirection in inline functions just
>> fine. I don't expect it would typically spill val to the stack. Is there
>> anything specific here that you think would make this more likely?
>
> Yes.  The design of the functions they're used in.  Adding this line
> results in:
>
>   add/remove: 0/0 grow/shrink: 7/2 up/down: 109/-36 (73)
>   Function                      old     new   delta
>   read_msr                     1243    1307     +64
>   resource_access               326     341     +15
>   hwp_init_msrs.cold            297     308     +11
>   probe_cpuid_faulting          168     175      +7
>   svm_msr_read_intercept       1034    1039      +5
>   hwp_write_request             113     117      +4
>   hwp_init_msrs                 371     374      +3
>   amd_log_freq                  844     828     -16
>   guest_rdmsr                  2168    2148     -20
>
> Taking read_msr() as a concrete example, this is because it's a store
> into a parent functions variable, not into a local variable, and cannot
> be elided.
>
>
>>
>>> I'd far rather get a real compiler error, than to have logic relying on
>>> the result of a faulting MSR read.
>> A compiler error? (Hmm, perhaps you think of uninitialized variable
>> diagnostics. That may or may not trigger, depending on how else the
>> caller's variable is used.)
>
> Yes I was referring to the uninitialised variable diagnostic.  *_safe()
> are fairly rare, and we've got plenty of coverage in CI.

Well, okay, slightly hesitantly
Reviewed-by: Jan Beulich <jbeul...@suse.com>
preferably with the paragraph in the description that I commented on
slightly expanded to cover the "why" aspect.

Jan
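
For readers following the thread, the behavioural difference under discussion can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the Xen code: fake_rdmsr(), rdmsr_safe_old(), rdmsr_safe_new() and demo() are hypothetical names, the "fault" is simulated with a boolean because RDMSR cannot be executed from user space, and -1 stands in for whatever error code the real function returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the RDMSR instruction: pretends that only
 * MSR 0x10 exists and that every other index faults.
 */
static bool fake_rdmsr(unsigned int msr, uint32_t *lo, uint32_t *hi)
{
    if ( msr != 0x10 )
        return false;            /* simulated fault */

    *lo = 0x89abcdef;
    *hi = 0x01234567;
    return true;
}

/* Old-style behaviour: *val is always written, with 0 on fault. */
static int rdmsr_safe_old(unsigned int msr, uint64_t *val)
{
    uint32_t lo = 0, hi = 0;
    int rc = fake_rdmsr(msr, &lo, &hi) ? 0 : -1;

    *val = lo | ((uint64_t)hi << 32);
    return rc;
}

/* New-style behaviour: *val is written only on success. */
static int rdmsr_safe_new(unsigned int msr, uint64_t *val)
{
    uint32_t lo, hi;

    if ( !fake_rdmsr(msr, &lo, &hi) )
        return -1;               /* caller's variable is left untouched */

    *val = lo | ((uint64_t)hi << 32);
    return 0;
}

/* Shows what a caller observes in each case. */
void demo(void)
{
    uint64_t v = 0xdead;

    /* Faulting read, old style: the caller's value is cleared. */
    assert(rdmsr_safe_old(0x11, &v) == -1 && v == 0);

    /* Faulting read, new style: the caller's previous value survives. */
    v = 0xdead;
    assert(rdmsr_safe_new(0x11, &v) == -1 && v == 0xdead);

    /* Successful read: both styles produce the same result. */
    assert(rdmsr_safe_new(0x10, &v) == 0 && v == 0x0123456789abcdefULL);
}
```

A caller that ignores the return value sees 0 after a fault with the old style, and whatever its variable held beforehand with the new style. That is why the commit message quoted above initialises temp to 0 in write_msr().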