On Tue, May 05, 2020 at 07:25:39PM +0100, Andrew Cooper wrote:
> AMD Fam15h processors introduced the flush-by-asid feature, for more fine
> grain flushing purposes.
> 
> Flushing everything including ASID 0 (i.e. Xen context) is an an unnecesserily
> large hammer, and never necessary in the context of guest TLBs needing
> invalidating.
> 
> When available, use TLB_CTRL_FLUSH_ASID in preference to TLB_CTRL_FLUSH_ALL.
> 
> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>

Reviewed-by: Roger Pau Monné <roger....@citrix.com>

I should also look into restricting the usage of FLUSH_HVM_ASID_CORE
and instead perform more fine grained per-vCPU flushes when possible,
since FLUSH_HVM_ASID_CORE resets the pCPU ASID generation forcing a
new ASID to be allocated for all vCPUs running on that pCPU.

> ---
> CC: Jan Beulich <jbeul...@suse.com>
> CC: Wei Liu <w...@xen.org>
> CC: Roger Pau Monné <roger....@citrix.com>
> 
> The APM currently describes tlb_control encoding 1 as "Flush entire
> TLB (Should be used only by legacy hypervisors.)".  AMD have agreed that this
> is misleading and should say "legacy hardware" instead.

AFAICT using TLB_CTRL_FLUSH_ASID on hardware not supporting the
feature has not been found to be safe? Ie: TLB_CTRL_FLUSH_ALL is a
subset of TLB_CTRL_FLUSH_ASID from a bitmap PoV, so if those bits
where ignored on older hardware it should be safe to unconditionally
use it.

> This change does come with a minor observed perf improvement on Fam17h
> hardware, of ~0.6s over ~22s for my XTF pagewalk test.  This test will spot
> TLB flushing issues, but isn't optimal for spotting the perf increase from
> better flushing.  There were no observed differences for Fam15h, but this
> could simply mean that the measured code footprint was larger than the TLB on
> this CPU.
> ---
>  xen/arch/x86/hvm/svm/asid.c       | 9 ++++++---
>  xen/include/asm-x86/hvm/svm/svm.h | 1 +
>  2 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/svm/asid.c b/xen/arch/x86/hvm/svm/asid.c
> index 9be90058c7..ab06dd3f3a 100644
> --- a/xen/arch/x86/hvm/svm/asid.c
> +++ b/xen/arch/x86/hvm/svm/asid.c
> @@ -18,6 +18,7 @@
>  #include <asm/amd.h>
>  #include <asm/hvm/nestedhvm.h>
>  #include <asm/hvm/svm/asid.h>
> +#include <asm/hvm/svm/svm.h>
>  
>  void svm_asid_init(const struct cpuinfo_x86 *c)
>  {
> @@ -47,15 +48,17 @@ void svm_asid_handle_vmrun(void)
>      if ( p_asid->asid == 0 )
>      {
>          vmcb_set_guest_asid(vmcb, 1);
> -        /* TODO: investigate using TLB_CTRL_FLUSH_ASID here instead. */
> -        vmcb->tlb_control = TLB_CTRL_FLUSH_ALL;
> +        vmcb->tlb_control =
> +            cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : 
> TLB_CTRL_FLUSH_ALL;
>          return;
>      }
>  
>      if ( vmcb_get_guest_asid(vmcb) != p_asid->asid )
>          vmcb_set_guest_asid(vmcb, p_asid->asid);
>  
> -    vmcb->tlb_control = need_flush ? TLB_CTRL_FLUSH_ALL : TLB_CTRL_NO_FLUSH;
> +    vmcb->tlb_control =
> +        !need_flush ? TLB_CTRL_NO_FLUSH :
> +        cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID : TLB_CTRL_FLUSH_ALL;

Since this code structure is used in two places I would consider
locally introducing something like:

#define TLB_CTRL_FLUSH_CMD (cpu_has_svm_flushbyasid ? TLB_CTRL_FLUSH_ASID \
                                                    : TLB_CTRL_FLUSH_ALL)

To abstract it away.

Thanks, Roger.

Reply via email to