It turns out that using the higher level helpers back-to-back like this leads
to terrible code generation.  Due to -fno-strict-aliasing, each store into
state-> invalidates the read_cr4() address calculation (which is really
cpu_info->cr4 under the hood), meaning that it can't be hoisted.

As a result we get "locate the top of stack block, get cr4, and see if
FSGSBASE is set" repeated 3 times, and an unreasonable number of basic blocks.

Hoist the calculation manually, which results in two basic blocks.

Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
CC: Jan Beulich <jbeul...@suse.com>
CC: Roger Pau Monné <roger....@citrix.com>

Side-by-side disassembly: https://termbin.com/9xfq
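
For anyone without the disassembly to hand, the shape of the problem can be
modelled in plain C.  This is only a sketch under stated assumptions: every
name ending in _sim is a hypothetical stand-in, not a Xen helper, and the
cr4_loads counter merely models the re-reads the compiler ends up emitting;
it does not reproduce the compiler's behaviour.

```c
#include <stdint.h>

/* CR4.FSGSBASE is bit 16 on x86. */
#define CR4_FSGSBASE_SIM (1ul << 16)

/* Hypothetical stand-ins for cpu_info and struct extra_state. */
struct cpu_info_sim { unsigned long cr4; };
struct extra_state_sim { unsigned long cr4, fsb, gsb, gss; };

static struct cpu_info_sim cpu_sim = { .cr4 = CR4_FSGSBASE_SIM };
static unsigned int cr4_loads;   /* how often cr4 is consulted */

static unsigned long read_cr4_sim(void)
{
    cr4_loads++;
    return cpu_sim.cr4;
}

/* Stand-ins for the RD{FS,GS}BASE and RDMSR paths; both return fake values. */
static unsigned long fast_read_sim(int which) { return 0x1000ul + which; }
static unsigned long slow_read_sim(int which) { return 0x1000ul + which; }

/* Models a higher level helper: each call re-checks cr4 for itself. */
static unsigned long read_base_sim(int which)
{
    return (read_cr4_sim() & CR4_FSGSBASE_SIM) ? fast_read_sim(which)
                                               : slow_read_sim(which);
}

/* Before: three helpers in a row, each repeating the cr4 check. */
static void read_registers_adjacent_sim(struct extra_state_sim *s)
{
    s->cr4 = read_cr4_sim();
    s->fsb = read_base_sim(0);
    s->gsb = read_base_sim(1);
    s->gss = read_base_sim(2);
}

/* After: the check is hoisted by hand and made once. */
static void read_registers_hoisted_sim(struct extra_state_sim *s)
{
    s->cr4 = read_cr4_sim();

    if ( s->cr4 & CR4_FSGSBASE_SIM )
    {
        s->fsb = fast_read_sim(0);
        s->gsb = fast_read_sim(1);
        s->gss = fast_read_sim(2);
    }
    else
    {
        s->fsb = slow_read_sim(0);
        s->gsb = slow_read_sim(1);
        s->gss = slow_read_sim(2);
    }
}
```

Both variants fill in the same values; the only difference in this model is
that the adjacent form consults cr4 once per helper on top of the initial
read, whereas the hoisted form consults it once in total.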
---
 xen/arch/x86/traps.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 0c5393cb2166..8c261d219c07 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -118,9 +118,18 @@ static void read_registers(struct extra_state *state)
     state->cr3 = read_cr3();
     state->cr4 = read_cr4();
 
-    state->fsb = read_fs_base();
-    state->gsb = read_gs_base();
-    state->gss = read_gs_shadow();
+    if ( state->cr4 & X86_CR4_FSGSBASE )
+    {
+        state->fsb = __rdfsbase();
+        state->gsb = __rdgsbase();
+        state->gss = __rdgskern();
+    }
+    else
+    {
+        state->fsb = rdmsr(MSR_FS_BASE);
+        state->gsb = rdmsr(MSR_GS_BASE);
+        state->gss = rdmsr(MSR_SHADOW_GS_BASE);
+    }
 
     asm ( "mov %%ds, %0" : "=m" (state->ds) );
     asm ( "mov %%es, %0" : "=m" (state->es) );
-- 
2.39.5

