At 2014-09-29 16:47:45,[email protected] wrote:
># HG changeset patch
># User Praveen Tiwari
># Date 1411980445 -19800
># Node ID 9a8552ea378500baa21b89b24d8aec99acf7cce2
># Parent 32f50df7fa7672f4c1818ddf3165b4bd243e0b10
>blockfill_s_16x16 avx2 asm code, performance improved 389.21 cycles -> 204.38
>cycles
>
>diff -r 32f50df7fa76 -r 9a8552ea3785 source/common/x86/blockcopy8.asm
>--- a/source/common/x86/blockcopy8.asm Fri Sep 26 17:33:09 2014 -0500
>+++ b/source/common/x86/blockcopy8.asm Mon Sep 29 14:17:25 2014 +0530
>@@ -1826,6 +1826,38 @@
>
> BLOCKFILL_S_W16_H8 16, 16
>
>+INIT_YMM avx2
>+cglobal blockfill_s_16x16, 3, 4, 1
>+add r1, r1
>+lea r3, [3 * r1]
>+
>+movd xm0, r2d
>+pshuflw xm0, xm0, 0
>+pshufd xm0, xm0, 0
>+
>+vinserti128 m0, m0, xm0, 1
vpbroadcastd
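
One possible reading of this comment, as a rough sketch rather than the committed code: the pshufd and vinserti128 pair in the quoted patch could be replaced by a single AVX2 broadcast. This assumes the fill value is a 16-bit word arriving in r2d, as the pshuflw in the patch suggests, and it uses the same x86inc register names (xm0, m0) as the patch.

```asm
; Sketch only: splat the 16-bit fill value across all of ymm0 (m0).
movd         xm0, r2d          ; fill value lands in the low word of xm0
pshuflw      xm0, xm0, 0       ; copy word 0 over the low four words, so the low dword holds the value twice
vpbroadcastd m0, xm0           ; broadcast that low dword to all eight dwords of m0
```

If a word broadcast is acceptable, `vpbroadcastw m0, xm0` placed directly after the movd might make the pshuflw unnecessary as well.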
_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
