On 12.08.2019 17:10, Andrew Cooper wrote:
mov/shr is easier to follow than shld, and doesn't have a merge dependency on
the previous value of %edx. Shorten the rest of the code by streamlining the
comments.
Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
CC: Jan Beulich <jbeul...@suse.com>
CC: Wei Liu <w...@xen.org>
CC: Roger Pau Monné <roger....@citrix.com>
In addition to being clearer to follow, mov/shr is faster than shld to decode
and execute. See https://godbolt.org/z/A5kvuC for the latency/throughput/port
analysis, the Intel Optimisation guide which classifies them as "Slow Int"
instructions, or the AMD Optimisation guide which specifically has a section
entitled "Alternatives to SHLD Instruction".
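For readers without the diff to hand, the equivalence being relied on can be modelled in C. This is only a sketch of the general instruction semantics, not the code from the patch: the registers and shift count used in the trampoline are not shown in this mail, and the helper names shld32, mov_shr32 and check_equivalence are made up for illustration. It assumes the destination register holds zero before the shld; under that assumption both forms produce the top bits of the source, but only shld has to read the old destination value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Models `shld $count, %src, %dst` for 32-bit operands, 0 < count < 32:
 * dst is shifted left and the vacated low bits are filled from the top
 * of src.  The result depends on the previous value of dst.
 */
static uint32_t shld32(uint32_t dst, uint32_t src, unsigned int count)
{
    return (dst << count) | (src >> (32 - count));
}

/*
 * Models `mov %src, %dst; shr $(32 - count), %dst`.  The previous value
 * of dst is overwritten by the mov and never read.
 */
static uint32_t mov_shr32(uint32_t src, unsigned int count)
{
    return src >> (32 - count);
}

/* Assumption: dst is zero before shld, so the two sequences must agree. */
static void check_equivalence(uint32_t src)
{
    for (unsigned int count = 1; count < 32; count++)
        assert(shld32(0, src, count) == mov_shr32(src, count));
}
```

Calling check_equivalence() with any 32-bit value exercises every valid shift count. With a non-zero destination the two models diverge, which is exactly the input dependency the commit message refers to.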
I don't really mind the change, but I don't think performance is a
concern here. Instead I think we want to size-optimize the trampoline
as much as possible, which is why (iirc) I had asked for the use of
SHLD here. Considering David's work to split boot and permanent
trampoline I'm fine with the minimal 1 byte increase though:
Reviewed-by: Jan Beulich <jbeul...@suse.com>
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel