> Moving the SUB so that it is followed by LD1 will hide the memory operation latency,

Thanks, that helped a little bit:

Before:
        scale2D_64to32  86.83x  158.42        13756.12

After:
        scale2D_64to32  87.00x  158.20        13764.38

Added to the patch.

Attachment: 0001-arm64-port-scale1D_128to64-and-scale2D_64to32.patch
