On 19/08/2025 2:19 pm, Jan Beulich wrote:
> On 15.08.2025 22:41, Andrew Cooper wrote:
>> It turns out that using the higher level helpers adjacent like this leads to
>> terrible code generation.  Due to -fno-strict-aliasing, the store into state->
>> invalidates the read_cr4() address calculation (which is really cpu_info->cr4
>> under the hood), meaning that it can't be hoisted.
>>
>> As a result we get "locate the top of stack block, get cr4, and see if
>> FSGSBASE is set" repeated 3 times, and an unreasonable number of basic blocks.
>>
>> Hoist the calculation manually, which results in two basic blocks.
>>
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
> Otoh the function here isn't really performance or size critical. I'm undecided
> whether the undesirable open-coding or the bad code gen are the lesser evil.

This function no, but every other place touching FS and GS is
performance critical.  They're all messy to start with, and get worse
under FRED.
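
For illustration only, here is a minimal sketch of the pattern the commit
message describes.  It is not the Xen code: every name ending in _sketch is
hypothetical, and the numeric "fast"/"slow" values merely stand in for the
two ways of reading a base.  The only thing taken from the thread is that
read_cr4() is really a load from memory (cpu_info->cr4), so a compiler
building with -fno-strict-aliasing may have to assume that each store
through the state pointer could have changed it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_FSGSBASE (1UL << 16)   /* CR4.FSGSBASE is bit 16 */

/* Hypothetical stand-ins, not the real Xen structures or helpers. */
struct cpu_info_sketch { unsigned long cr4; };
struct seg_state_sketch { uint64_t fs_base, gs_base, gs_shadow; };

static struct cpu_info_sketch cpu_sketch = { .cr4 = CR4_FSGSBASE };

/* Models read_cr4(): a load from memory rather than a register read. */
static unsigned long read_cr4_sketch(void)
{
    return cpu_sketch.cr4;
}

/* Models a higher level helper that tests CR4.FSGSBASE on every call. */
static uint64_t read_base_sketch(uint64_t fast, uint64_t slow)
{
    return (read_cr4_sketch() & CR4_FSGSBASE) ? fast : slow;
}

/*
 * Adjacent helper calls.  Each store into *s sits between two CR4 loads,
 * so the compiler may be unable to merge the three loads and tests.
 */
static void save_adjacent_sketch(struct seg_state_sketch *s)
{
    s->fs_base   = read_base_sketch(1, 101);
    s->gs_base   = read_base_sketch(2, 102);
    s->gs_shadow = read_base_sketch(3, 103);
}

/* Manually hoisted: one CR4 load, one test, two straight-line paths. */
static void save_hoisted_sketch(struct seg_state_sketch *s)
{
    bool fsgsbase = read_cr4_sketch() & CR4_FSGSBASE;

    if ( fsgsbase )
    {
        s->fs_base   = 1;
        s->gs_base   = 2;
        s->gs_shadow = 3;
    }
    else
    {
        s->fs_base   = 101;
        s->gs_base   = 102;
        s->gs_shadow = 103;
    }
}

/* Both forms must store the same values; only the code shape differs. */
static void check_equivalent_sketch(void)
{
    struct seg_state_sketch a, b;

    save_adjacent_sketch(&a);
    save_hoisted_sketch(&b);
    assert(a.fs_base == b.fs_base);
    assert(a.gs_base == b.gs_base);
    assert(a.gs_shadow == b.gs_shadow);
}
```

The two save functions are behaviourally identical; the hoisted form simply
makes explicit the single-test assumption that the compiler cannot prove for
itself.  Whether a given compiler actually emits three tests for the
adjacent form depends on the real helpers and build flags.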

~Andrew
