On 12/02/18 17:20, Volodymyr Babchuk wrote:
Julien,

Hi,

On 12.02.18 19:12, Julien Grall wrote:
On 12/02/18 16:55, Volodymyr Babchuk wrote:
Hi Julien,

Hi Volodymyr,

On 08.02.18 21:21, Julien Grall wrote:
Add the detection and runtime code for ARM_SMCCC_ARCH_WORKAROUND_1.

Signed-off-by: Julien Grall <julien.gr...@arm.com>

---
     Changes in v2:
         - Patch added
---
  xen/arch/arm/arm64/bpi.S    | 12 ++++++++++++
  xen/arch/arm/cpuerrata.c    | 32 +++++++++++++++++++++++++++++++-
  xen/include/asm-arm/smccc.h |  1 +
  3 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/bpi.S b/xen/arch/arm/arm64/bpi.S
index 4b7f1dc21f..ef237de7bd 100644
--- a/xen/arch/arm/arm64/bpi.S
+++ b/xen/arch/arm/arm64/bpi.S
@@ -16,6 +16,8 @@
   * along with this program.  If not, see <http://www.gnu.org/licenses/>.
   */
+#include <asm/smccc.h>
+
  .macro ventry target
      .rept 31
      nop
@@ -81,6 +83,16 @@ ENTRY(__psci_hyp_bp_inval_start)
      add     sp, sp, #(8 * 18)
  ENTRY(__psci_hyp_bp_inval_end)
+ENTRY(__smccc_workaround_1_smc_start)
+    sub     sp, sp, #(8 * 4)
+    stp     x2, x3, [sp, #(8 * 0)]
+    stp     x0, x1, [sp, #(8 * 2)]
+    mov     w0, #ARM_SMCCC_ARCH_WORKAROUND_1_FID
+    ldp     x2, x3, [sp, #(8 * 0)]
+    ldp     x0, x1, [sp, #(8 * 2)]
+    add     sp, sp, #(8 * 4)
+ENTRY(__smccc_workaround_1_smc_end)
+

This code confuses me. You allocate 32 bytes on the stack, save x0-x3 there, then load ARM_SMCCC_ARCH_WORKAROUND_1_FID into w0 and restore x0-x3, overwriting the value just written into w0. Am I missing something?

The call to ARM_SMCCC_ARCH_WORKAROUND_1 does not return any value. Even if it did, this code is executed on exception entry, before jumping into the trap helper, so you want to restore all the saved registers.

I believe you missed the smc instruction in the code above.

Whoops yes. I will fix it.
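
For reference, a sketch of what the corrected sequence might look like, assuming the fix is simply to issue the SMC right after loading the function ID. It keeps the labels and the ARM_SMCCC_ARCH_WORKAROUND_1_FID name from the patch above. The use of `smc #0` and the assumption that firmware may clobber only x0-x3 (per SMCCC v1.1) are not taken from the patch, and this is not necessarily what the next version will contain:

```asm
ENTRY(__smccc_workaround_1_smc_start)
    sub     sp, sp, #(8 * 4)            /* reserve 32 bytes in one sp update */
    stp     x2, x3, [sp, #(8 * 0)]      /* save the registers the call may clobber */
    stp     x0, x1, [sp, #(8 * 2)]
    mov     w0, #ARM_SMCCC_ARCH_WORKAROUND_1_FID
    smc     #0                          /* assumed fix: actually call the firmware */
    ldp     x2, x3, [sp, #(8 * 0)]      /* restore the guest-visible values */
    ldp     x0, x1, [sp, #(8 * 2)]
    add     sp, sp, #(8 * 4)
ENTRY(__smccc_workaround_1_smc_end)
```

With the SMC in place, the final restore of x0 is intentional: the function ID in w0 is only needed for the duration of the call.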



Btw, you can use something like stp    x0, x1, [sp, #-16]! to avoid manual adjustment of sp. This will save you two instructions.
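
A sketch of the pre-indexed/post-indexed variant suggested here, for comparison. It assumes an `smc #0` between save and restore and reuses the ARM_SMCCC_ARCH_WORKAROUND_1_FID name from the patch. It is illustrative only, not proposed code:

```asm
    stp     x2, x3, [sp, #-16]!         /* pre-index: sp -= 16, then store */
    stp     x0, x1, [sp, #-16]!
    mov     w0, #ARM_SMCCC_ARCH_WORKAROUND_1_FID
    smc     #0                          /* assumed: the call itself */
    ldp     x0, x1, [sp], #16           /* post-index: load, then sp += 16 */
    ldp     x2, x3, [sp], #16
```

This drops the explicit sub/add pair (two fewer instructions) at the cost of writing sp four times instead of twice; sp stays 16-byte aligned throughout.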

It was pointed out on Linux Arm that updating sp once *might* be faster on some uarch.

So is this code targeted at some specific uarch? If so, I would like to see a comment describing why you chose this approach.

I can't confirm whether this will improve uarch A, B, C or Z. I just followed the suggestion on Linux Arm (see [1]) and my personal preference for how to write assembly code. The two approaches are quite similar, so why would I choose the other way around?

Cheers,

[1] https://www.spinics.net/lists/arm-kernel/msg626659.html

--
Julien Grall

