At 2014-09-29 16:47:45,[email protected] wrote:
># HG changeset patch
># User Praveen Tiwari
># Date 1411980445 -19800
># Node ID 9a8552ea378500baa21b89b24d8aec99acf7cce2
># Parent  32f50df7fa7672f4c1818ddf3165b4bd243e0b10
>blockfill_s_16x16 avx2 asm code, performance improved 389.21 cycles -> 204.38 
>cycles
>
>diff -r 32f50df7fa76 -r 9a8552ea3785 source/common/x86/blockcopy8.asm
>--- a/source/common/x86/blockcopy8.asm Fri Sep 26 17:33:09 2014 -0500
>+++ b/source/common/x86/blockcopy8.asm Mon Sep 29 14:17:25 2014 +0530
>@@ -1826,6 +1826,38 @@
> 
> BLOCKFILL_S_W16_H8 16, 16
> 
>+INIT_YMM avx2
>+cglobal blockfill_s_16x16, 3, 4, 1
>+add        r1, r1
>+lea        r3, [3 * r1]
>+
>+movd       xm0, r2d
>+pshuflw    xm0, xm0, 0
>+pshufd     xm0, xm0, 0
>+
>+vinserti128 m0, m0, xm0, 1

vpbroadcastd
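
For reference, a rough sketch of what this might look like, assuming the suggestion is to replace the pshufd + vinserti128 pair in the quoted patch (untested, not part of the patch):

```asm
movd         xm0, r2d          ; low dword of xmm0 = fill value (16-bit value in the low word)
pshuflw      xm0, xm0, 0       ; copy word 0 into the low four words
vpbroadcastd m0, xm0           ; AVX2: replicate the low dword into all 8 dwords of ymm0
```

After this, all 16 words of m0 hold the fill value. vpbroadcastw m0, xm0 straight after the movd would presumably work as well, since it replicates the low word directly.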
 
 
_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
