Hello,

This is the v3 patch series, which addresses Chen's review comments.

A comment on your suggestions: I have kept the TBL instruction, as the REV64 instruction does not offer any performance advantage. Keeping TBL also maintains code consistency and improves readability.

Best regards,
Gerda

Gerda Zsejke More (4):
  AArch64: Add SVE implementation of HBD interp_horiz_pp
  AArch64: Add SVE implementation of HBD interp_horiz_ps
  AArch64: Add SVE implementation of HBD interp_vert_ss
  AArch64: Add SVE implementation of HBD interp_vert_pp

 source/common/CMakeLists.txt              |    2 +-
 source/common/aarch64/asm-primitives.cpp  |    2 +
 source/common/aarch64/filter-prim-sve.cpp | 1057 +++++++++++++++++++++
 source/common/aarch64/filter-prim-sve.h   |   37 +
 source/common/aarch64/neon-sve-bridge.h   |   12 +
 5 files changed, 1109 insertions(+), 1 deletion(-)
 create mode 100644 source/common/aarch64/filter-prim-sve.cpp
 create mode 100644 source/common/aarch64/filter-prim-sve.h

--
2.39.5 (Apple Git-154)

_______________________________________________
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel