On 21.08.2025 22:52, Andrew Cooper wrote:
> On 19/08/2025 2:19 pm, Jan Beulich wrote:
>> On 15.08.2025 22:41, Andrew Cooper wrote:
>>> It turns out that using the higher level helpers adjacent like this leads to
>>> terrible code generation.  Due to -fno-strict-aliasing, the store into state->
>>> invalidates the read_cr4() address calculation (which is really cpu_info->cr4
>>> under the hood), meaning that it can't be hoisted.
>>>
>>> As a result we get "locate the top of stack block, get cr4, and see if
>>> FSGSBASE is set" repeated 3 times, and an unreasonable number of basic blocks.
>>>
>>> Hoist the calculation manually, which results in two basic blocks.
>>>
>>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>> Otoh the function here isn't really performance or size critical. I'm undecided
>> whether the undesirable open-coding or the bad code gen are the lesser evil.
> 
> This function no, but every other place touching FS and GS is
> performance critical.  They're all messy to start with, and get worse
> under FRED.

Is there any (further) bad effect to the function here by the time all of the
FRED bits are in?

Jan
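
For readers following the thread, the trade-off in the quoted commit message (adjacent helpers that each re-check CR4.FSGSBASE, versus open-coding a single hoisted check) can be sketched roughly as below. This is not the Xen code: every name ending in _sketch, the simulated base values and the cr4_reads counter are hypothetical stand-ins, and the counter only makes the number of source-level checks visible. What a given compiler actually emits is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4.FSGSBASE is architecturally bit 16 of CR4. */
#define CR4_FSGSBASE (1UL << 16)

/* Hypothetical stand-ins for per-CPU data and the saved state. */
struct cpu_info_sketch  { unsigned long cr4; };
struct seg_state_sketch { uint64_t fs_base, gs_base, gs_shadow; };

static struct cpu_info_sketch cpu;
static unsigned int cr4_reads;          /* illustration only */

/* Simulated hardware values; real code would use instructions or MSRs. */
static const uint64_t hw_fs = 0x1000, hw_gs = 0x2000, hw_gs_shadow = 0x3000;

static unsigned long read_cr4_sketch(void)
{
    cr4_reads++;
    return cpu.cr4;
}

/* Each high-level helper performs its own FSGSBASE check. */
static uint64_t read_fs_base_sketch(void)
{
    return (read_cr4_sketch() & CR4_FSGSBASE) ? hw_fs /* fast path */
                                              : hw_fs /* MSR path */;
}

static uint64_t read_gs_base_sketch(void)
{
    return (read_cr4_sketch() & CR4_FSGSBASE) ? hw_gs : hw_gs;
}

static uint64_t read_gs_shadow_sketch(void)
{
    return (read_cr4_sketch() & CR4_FSGSBASE) ? hw_gs_shadow : hw_gs_shadow;
}

/*
 * Adjacent helpers: each store through 'state' may alias cpu.cr4 as far
 * as the compiler can tell, so the check is redone after every store.
 */
void save_adjacent_sketch(struct seg_state_sketch *state)
{
    state->fs_base   = read_fs_base_sketch();
    state->gs_base   = read_gs_base_sketch();
    state->gs_shadow = read_gs_shadow_sketch();
}

/* Manually hoisted: one check, then two straight-line arms. */
void save_hoisted_sketch(struct seg_state_sketch *state)
{
    bool fsgsbase = read_cr4_sketch() & CR4_FSGSBASE;

    if ( fsgsbase )
    {
        /* Fast path for all three values. */
        state->fs_base   = hw_fs;
        state->gs_base   = hw_gs;
        state->gs_shadow = hw_gs_shadow;
    }
    else
    {
        /* MSR path for all three values. */
        state->fs_base   = hw_fs;
        state->gs_base   = hw_gs;
        state->gs_shadow = hw_gs_shadow;
    }
}
```

Both variants fill in the same state. The hoisted one performs the CR4 check once instead of three times, at the cost of open-coding what the helpers would otherwise hide.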
