Hi,

This patch series adds further optimised implementations of the ipfilter
primitives, using Armv8.4 Neon DotProd and Armv8.6 Neon I8MM instructions.

Relative performance numbers are in the individual commit messages.

The series is based on the x265_git master branch.

Many thanks,
Hari

George Steed (1):
  testbench.cpp: Guard extensions based on architecture

Hari Limaye (13):
  AArch64: Add Armv8.4 Neon DotProd implementations of luma_hpp
  AArch64: Add Armv8.4 Neon DotProd implementations of luma_hps
  AArch64: Add Armv8.4 Neon DotProd implementations of filter_hpp
  AArch64: Add Armv8.4 Neon DotProd implementations of filter_hps
  AArch64: Add Armv8.4 Neon DotProd implementation of interp_hv_pp
  AArch64: Add Armv8.6 Neon I8MM feature detection
  AArch64: Add Armv8.6 Neon I8MM implementations of luma_hpp
  AArch64: Add Armv8.6 Neon I8MM implementations of luma_hps
  AArch64: Add Armv8.6 Neon I8MM implementations of chroma_hpp
  AArch64: Add Armv8.6 Neon I8MM implementation of interp_hv_pp
  AArch64: Add Armv8.4 Neon DotProd implementations of luma_vps
  AArch64: Add Armv8.6 Neon I8MM implementations of luma_vps
  AArch64: Add Armv8.6 Neon I8MM implementations of luma_vpp

 build/README.txt                              |   23 +-
 source/CMakeLists.txt                         |   32 +-
 source/cmake/FindNEON_I8MM.cmake              |   21 +
 source/common/CMakeLists.txt                  |   14 +
 source/common/aarch64/asm-primitives.cpp      |   14 +
 source/common/aarch64/filter-neon-dotprod.cpp | 1131 +++++++++++++
 source/common/aarch64/filter-neon-dotprod.h   |   37 +
 source/common/aarch64/filter-neon-i8mm.cpp    | 1412 +++++++++++++++++
 source/common/aarch64/filter-neon-i8mm.h      |   37 +
 source/common/aarch64/mem-neon.h              |   16 +
 source/common/cpu.cpp                         |   18 +-
 source/test/testbench.cpp                     |    4 +
 source/x265.h                                 |    1 +
 13 files changed, 2742 insertions(+), 18 deletions(-)
 create mode 100644 source/cmake/FindNEON_I8MM.cmake
 create mode 100644 source/common/aarch64/filter-neon-dotprod.cpp
 create mode 100644 source/common/aarch64/filter-neon-dotprod.h
 create mode 100644 source/common/aarch64/filter-neon-i8mm.cpp
 create mode 100644 source/common/aarch64/filter-neon-i8mm.h

-- 
2.42.1
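
For readers unfamiliar with why dot-product instructions suit these filters, here is a rough scalar sketch. It is not code from the patches. An 8-tap filter over 8-bit pixels can be written as two 4-element dot products, which is the shape of one 32-bit lane of a Neon dot-product instruction. The coefficients are the standard HEVC half-sample luma taps. The helper names (dot4, filter8_mixed, filter8_biased, round_clip) are hypothetical, and the rounding step (add 32, shift right by 6, clip to 8 bits) is assumed for an 8-bit pixel-to-pixel output. The "biased" variant shows one common way to use a signed-only dot product on unsigned pixels; whether the patches do exactly this is an assumption.

```cpp
#include <algorithm>
#include <cstdint>

// Standard HEVC 8-tap luma coefficients for the half-sample position.
// They sum to 64.
static const int8_t kHalfPel[8] = { -1, 4, -11, 40, 40, -11, 4, -1 };

// Hypothetical scalar model of one 32-bit lane of a 4-way dot product:
// four 8-bit products accumulated into a 32-bit sum.
template <typename PixelT>
static int32_t dot4(const PixelT *p, const int8_t *c)
{
    int32_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc += int32_t(p[i]) * int32_t(c[i]);
    return acc;
}

// Mixed-sign form: unsigned pixels times signed coefficients.
static int32_t filter8_mixed(const uint8_t *src)
{
    return dot4(src, kHalfPel) + dot4(src + 4, kHalfPel + 4);
}

// Signed-only form (assumed technique): shift pixels into the int8 range by
// subtracting 128, then add back 128 * sum(coefficients) = 128 * 64.
static int32_t filter8_biased(const uint8_t *src)
{
    int8_t s[8];
    for (int i = 0; i < 8; i++)
        s[i] = int8_t(int(src[i]) - 128);
    return dot4(s, kHalfPel) + dot4(s + 4, kHalfPel + 4) + 128 * 64;
}

// Assumed rounding for 8-bit pixel output: round, shift by 6, clip to [0, 255].
static uint8_t round_clip(int32_t sum)
{
    int32_t v = (sum + 32) >> 6;
    return uint8_t(std::min(255, std::max(0, v)));
}
```

Both forms give the same sum for any 8-bit input. For the ramp 10, 20, ..., 80 the sum is 2880, which rounds to 45, the midpoint of the two centre pixels.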