# HG changeset patch
# User Praveen Tiwari
# Date 1383909804 -19800
# Node ID b7ad918d8cb4793e9dd28837fc7ceda452284bfc
# Parent 43e537f064b98e20b7dec87fecc831e298f5b661
blockcopy_sp_8x6, optimized asm code

diff -r 43e537f064b9 -r b7ad918d8cb4 source/common/x86/blockcopy8.asm
--- a/source/common/x86/blockcopy8.asm Fri Nov 08 16:40:28 2013 +0530
+++ b/source/common/x86/blockcopy8.asm Fri Nov 08 16:53:24 2013 +0530
@@ -1143,36 +1143,31 @@
 ; void blockcopy_sp_8x6(pixel *dest, intptr_t destStride, int16_t *src, intptr_t srcStride)
 ;-----------------------------------------------------------------------------
 INIT_XMM sse2
-cglobal blockcopy_sp_8x6, 4, 5, 7, dest, destStride, src, srcStride
+cglobal blockcopy_sp_8x6, 4, 4, 6, dest, destStride, src, srcStride

 add r3, r3

-mova m0, [tab_Vm]
+movu m0, [r2]
+movu m1, [r2 + r3]
+movu m2, [r2 + 2 * r3]
+lea r2, [r2 + 2 * r3]
+movu m3, [r2 + r3]
+movu m4, [r2 + 2 * r3]
+lea r2, [r2 + 2 * r3]
+movu m5, [r2 + r3]

-movu m1, [r2]
-movu m2, [r2 + r3]
-movu m3, [r2 + 2 * r3]
-lea r4, [r2 + 2 * r3]
-movu m4, [r4 + r3]
-movu m5, [r4 + 2 * r3]
-lea r4, [r4 + 2 * r3]
-movu m6, [r4 + r3]
+packuswb m0, m1
+packuswb m2, m3
+packuswb m4, m5

-pshufb m1, m0
-pshufb m2, m0
-pshufb m3, m0
-pshufb m4, m0
-pshufb m5, m0
-pshufb m6, m0
-
-movh [r0], m1
-movh [r0 + r1], m2
-movh [r0 + 2 * r1], m3
-lea r4, [r0 + 2 * r1]
-movh [r4 + r1], m4
-movh [r4 + 2 * r1], m5
-lea r4, [r4 + 2 * r1]
-movh [r4 + r1], m6
+movlps [r0], m0
+movhps [r0 + r1], m0
+movlps [r0 + 2 * r1], m2
+lea r0, [r0 + 2 * r1]
+movhps [r0 + r1], m2
+movlps [r0 + 2 * r1], m4
+lea r0, [r0 + 2 * r1]
+movhps [r0 + r1], m4

 RET

_______________________________________________
x265-devel mailing list
https://mailman.videolan.org/listinfo/x265-devel
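
For readers checking the new instruction sequence, here is a minimal scalar C sketch of what the `packuswb` path appears to compute: an 8x6 block of `int16_t` samples narrowed to 8-bit pixels with unsigned saturation. This is not code from the x265 tree. The name `blockcopy_sp_8x6_ref`, the helper `clip_u8`, and the `typedef` of `pixel` as `uint8_t` (an 8-bit build) are assumptions made for illustration. Both strides are taken in elements, which matches the `add r3, r3` that doubles `srcStride` into a byte offset.

```c
#include <stdint.h>

/* Assumption: 8-bit build, so `pixel` is an unsigned byte. */
typedef uint8_t pixel;

/* Hypothetical helper: clamp a signed 16-bit sample to [0, 255],
 * mirroring the unsigned saturation that packuswb applies. */
static pixel clip_u8(int16_t v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (pixel)v;
}

/* Hypothetical scalar reference for the 8x6 short-to-pixel copy.
 * destStride and srcStride are in elements, not bytes. */
void blockcopy_sp_8x6_ref(pixel *dest, intptr_t destStride,
                          const int16_t *src, intptr_t srcStride)
{
    for (int y = 0; y < 6; y++) {
        for (int x = 0; x < 8; x++)
            dest[y * destStride + x] = clip_u8(src[y * srcStride + x]);
    }
}
```

The removed `pshufb` path selected bytes through the `tab_Vm` shuffle table, whose contents are not shown in this patch, so its results for samples outside 0..255 may differ from the saturating behaviour sketched here. For in-range input the two should agree.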
