On 22/04/2025 10:07 am, Jan Beulich wrote:
> On 17.04.2025 19:24, Andrew Cooper wrote:
>> On 07/04/2025 4:48 pm, Jan Beulich wrote:
>>> On 07.04.2025 17:35, Andrew Cooper wrote:
>>>> Unlike the WRMSR side, we can't use asm goto() unconditionally, because our
>>>> toolchain baseline doesn't support asm goto with outputs.
>>> Is there actually a benefit we gain from now needing to maintain two 
>>> different
>>> pieces of logic fulfilling the same purpose?
>> IMO, yes.  Besides getting rid of .fixup/unlikely, the code generation
>> is better-enough to warrant it, including getting the common path
>> correct (the referenced labels are all considered cold).
>>
>> e.g. for this change, we go from:
>>
>>     xor    %esi,%esi
>>     rdmsr
>>     test   %esi,%esi
>>     jne    <init_amd+0x540>
>>     and    $0xfffffffe,%edx
>>     wrmsr
>>
>> (note the forward branch) to simply:
>>
>>     rdmsr
>>     and    $0xfffffffe,%edx
>>     wrmsr
>>
>> because the exception table redirect is directly into init_amd.cold, and
>> we don't have to hold `int err` in a register across the asm() block.
>>
>> This is an intentionally simple example to get the infrastructure in,
>> but vmread() will definitely benefit.
>>
>>>> Also, there's a different errata workaround we'll need if we want to use 
>>>> asm
>>>> goto() with "+" constraints:
>>>>
>>>> config CC_HAS_ASM_GOTO_TIED_OUTPUT
>>>>    depends on CC_HAS_ASM_GOTO_OUTPUT
>>>>    # Detect buggy gcc and clang, fixed in gcc-11 clang-14.
>>>>    def_bool $(success,echo 'int foo(int *x) { asm goto (".long (%l[bar]) - 
>>>> .": "+m"(*x) ::: bar); return *x; bar: return 0; }' | $CC -x c - -c -o 
>>>> /dev/null)
>>>>
>>>> I'm tempted to put it in straight away, lest we forget about it.
>>> Perhaps best if we really want to go this route. Yet then - why "TIED"? 
>>> Isn't
>>> "tied" the term they use when referring to an earlier operand by using a
>>> digit (or the operand's name in square brackets)?
>> This is straight from Linux.  I've not looked at the issue in detail.
> So what I see is that Sean in Linux commit 1aa0e8b144b6 uses this term also
> in the description. I'm unconvinced it's correct, though. Gcc doc doesn't
> call the "+" modifier anything special, and it calls the numeric constraints
> (for which "+" can be a shorthand in certain cases) "matching constraint".
>
> We can of course sort the naming in the eventual patch pulling in that
> behavior, yet I'd like to suggest already now that we don't blindly follow
> Linux'es naming (unless the choice can be backed by some doc reference).

In the meantime, I found
https://clang.llvm.org/docs/LanguageExtensions.html#asm-goto-with-output-constraints

"When using tied-outputs (i.e. outputs that are inputs and outputs, not
just outputs) with the +r constraint"

So Clang does name this case specifically as tied outputs.

>
>>>> --- a/xen/Kconfig
>>>> +++ b/xen/Kconfig
>>>> @@ -41,6 +41,20 @@ config CC_SPLIT_SECTIONS
>>>>  config CC_HAS_UBSAN
>>>>    def_bool $(cc-option,-fsanitize=undefined)
>>>>  
>>>> +# Fixed in GCC 14, 13.3, 12.4 and 11.5
>>>> +# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
>>>> +config GCC_ASM_GOTO_OUTPUT_BROKEN
>>>> +  bool
>>>> +  depends on CC_IS_GCC
>>>> +  default y if GCC_VERSION < 110500
>>>> +  default y if GCC_VERSION >= 120000 && GCC_VERSION < 120400
>>>> +  default y if GCC_VERSION >= 130000 && GCC_VERSION < 130300
>>> Unlike for pre-release versions (x.0.y) I view this as problematic. Distros
>>> are likely to have backported the fix before the minor releases took place.
>>> Or they may have backported without ever meaning to follow later minor
>>> releases. We'd needlessly exclude them here. Imo ...
>>>
>>>> +config CC_HAS_ASM_GOTO_OUTPUT
>>>> +  def_bool y
>>>> +  depends on !GCC_ASM_GOTO_OUTPUT_BROKEN
>>>> +  depends on $(success,echo 'int foo(int x) { asm goto ("": "=r"(x) ::: 
>>>> bar); return x; bar: return 0; }' | $(CC) -x c - -c -o /dev/null)
>>> ... the only option is to actually probe for support as well as the (non-)
>>> buggy-ness.
>> There is no sensible way to probe.  It compiles fine, but (AIUI) fails
>> to spill registers correctly on some paths, which also makes it very
>> sensitive to other optimisations.
> Hmm, okay, Linux commit f2f6a8e88717 kind of suggests that there might have
> been more issues in gcc. Really I can't help the impression that the issue
> still wasn't fully understood, and hence may re-surface in another context.
> In which case I guess I agree the above is the best we can do for the time
> being, until we learn of further breakage:
> Reviewed-by: Jan Beulich <jbeul...@suse.com>

Thanks.

~Andrew

Reply via email to