From: Haseeb Ashraf <[email protected]>

This patch series addresses a major issue when running Xen on KVM: the
costly emulation of TLBI VMALLS12E1IS, which gets worse the more often
this TLBI is invoked. There are mainly two places where this is
problematic:
(a) When vCPUs are switched on a pCPU or pCPUs.
(b) When domU pages mapped into dom0 are unmapped: each page removed
    by XENMEM_remove_from_physmap has its TLB entries invalidated by
    the TLBI variant that flushes everything rather than only the
    page in question.

This patch series uses IPA-based TLBIs wherever possible instead of
flushing the TLBs completely every time.
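
As a rough illustration of this preference, the following C sketch
models the choice between per-IPA invalidation and a full flush. It is
not code from the series: flush_guest_range(), tlbi_ipa(), tlbi_all()
and the IPA_TLBI_MAX_PAGES value are hypothetical, and counters stand
in for the privileged TLBI instructions so the logic can be exercised
in user space. The cut-off mirrors the threshold mentioned in the v2
changes; its actual value is not given here.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Hypothetical cut-off, chosen only for this illustration. */
#define IPA_TLBI_MAX_PAGES 16

/* Counters stand in for the real (privileged) TLBI instructions. */
static unsigned int nr_ipa_tlbi;
static unsigned int nr_full_tlbi;

/* Stand-in for an IPA-based stage-2 invalidation of one page. */
void tlbi_ipa(uint64_t ipa)
{
    (void)ipa;
    nr_ipa_tlbi++;
}

/* Stand-in for a full invalidation such as VMALLS12E1IS. */
void tlbi_all(void)
{
    nr_full_tlbi++;
}

/* Invalidate nr_pages pages starting at the page-aligned start_ipa. */
void flush_guest_range(uint64_t start_ipa, unsigned long nr_pages)
{
    unsigned long i;

    assert((start_ipa & (PAGE_SIZE - 1)) == 0);

    /* Large ranges: one full flush is assumed cheaper than many TLBIs. */
    if ( nr_pages > IPA_TLBI_MAX_PAGES )
    {
        tlbi_all();
        return;
    }

    /* Small ranges: invalidate only the pages that are known. */
    for ( i = 0; i < nr_pages; i++ )
        tlbi_ipa(start_ipa + ((uint64_t)i << PAGE_SHIFT));
}
```

Under these assumptions, flush_guest_range(0x40000000, 4) issues four
per-IPA invalidations, while a 1024-page range falls back to a single
full flush. Barriers are left out of this sketch.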

It consists of three patches. The first addresses the issue described
above for Arm64. The second further optimizes the combined stage-1,2
TLB flushes by leveraging FEAT_nTLBPA. The third introduces IPA-based
TLBI for Arm32 in the presence of FEAT_nTLBPA.

Haseeb Ashraf (3):
  xen/arm/p2m: perform IPA-based TLBI when IPA is known
  xen/arm: optimize stage-1,2 combined TLBI in presence of FEAT_nTLBPA
  xen/arm32: add CPU capability for IPA-based TLBI

Changes in v3:
- Main change: reworked the handling of the repeat TLBI workaround
  with IPA-based TLBI, so that the extra TLBI and DSB are issued only
  for the final TLBI and DSB of the whole sequence.
- Updated code comments and made minor code changes as per feedback.
  Further details are available in each commit's changelog.
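
The repeat TLBI handling described in the first item can be pictured
with the C sketch below. It is only a model built on assumptions: the
need_repeat_tlbi flag and the flush_ipa_sequence(), tlbi_ipa() and
dsb() helpers are hypothetical names, counters replace the real
instructions, and barriers other than the completing DSB are left out
for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: true when the repeat TLBI workaround is active. */
static bool need_repeat_tlbi;

/* Counters stand in for the real TLBI and DSB instructions. */
static unsigned int nr_tlbi;
static unsigned int nr_dsb;

void tlbi_ipa(uint64_t ipa)
{
    (void)ipa;
    nr_tlbi++;
}

void dsb(void)
{
    nr_dsb++;
}

/* Invalidate n IPAs as one sequence, completed by a single DSB. */
void flush_ipa_sequence(const uint64_t *ipas, unsigned int n)
{
    unsigned int i;

    assert(ipas != NULL || n == 0);

    if ( n == 0 )
        return;

    for ( i = 0; i < n; i++ )
        tlbi_ipa(ipas[i]);

    dsb();

    /*
     * Workaround: repeat only the final TLBI and DSB of the sequence,
     * rather than repeating them after every single TLBI.
     */
    if ( need_repeat_tlbi )
    {
        tlbi_ipa(ipas[n - 1]);
        dsb();
    }
}
```

In this model a sequence of three IPAs costs three TLBIs and one DSB
without the workaround, and four TLBIs and two DSBs with it, instead
of doubling every TLBI and DSB in the sequence.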

Changes in v2:
- Split the single commit into three. The first commit provides the
  baseline implementation without adding any new CPU capabilities.
  The new CPU capabilities are introduced in separate commits to
  show how each of them optimizes TLB invalidation.
- Moved the Arm32- and Arm64-specific implementations of the TLBIs
  to the architecture-specific flushtlb.h.
- Added references to the Arm ARM in code comments.
- Evaluated and added a threshold: IPA-based TLB invalidation is used
  up to the threshold, with a fallback to full stage TLB invalidation
  above it.
- Introduced ARM_HAS_NTLBPA CPU capability which leverages
  FEAT_nTLBPA for arm32 as well as arm64.
- Introduced ARM_HAS_TLB_IPA CPU capability for IPA-based TLBI
  for arm32.

 xen/arch/arm/cpufeature.c                 | 31 ++++++++
 xen/arch/arm/include/asm/arm32/flushtlb.h | 87 +++++++++++++++++++++
 xen/arch/arm/include/asm/arm64/flushtlb.h | 77 +++++++++++++++++++
 xen/arch/arm/include/asm/cpregs.h         |  4 +
 xen/arch/arm/include/asm/cpufeature.h     | 27 ++++++-
 xen/arch/arm/include/asm/mmu/p2m.h        |  2 +
 xen/arch/arm/include/asm/processor.h      | 10 +++
 xen/arch/arm/mmu/p2m.c                    | 92 +++++++++++++++++------
 8 files changed, 302 insertions(+), 28 deletions(-)

-- 
2.43.0

