---------- Forwarded message ----------
From: chen <[email protected]>
Date: Fri, Nov 8, 2013 at 3:29 PM
Subject: Re: [x265] [PATCH] blockcopy_sp_4x8, optimized asm code
To: Development for x265 <[email protected]>

At 2013-11-08 17:34:19,[email protected] wrote:
># HG changeset patch
># User Praveen Tiwari
># Date 1383903250 -19800
># Node ID 1e6bf52b6e3471b81e636569daa667f6dec9838a
># Parent 44ac213169c906eab5cba6b4aba876391b81da99
>blockcopy_sp_4x8, optimized asm code
>
>diff -r 44ac213169c9 -r 1e6bf52b6e34 source/common/x86/blockcopy8.asm
>--- a/source/common/x86/blockcopy8.asm Fri Nov 08 14:46:07 2013 +0530
>+++ b/source/common/x86/blockcopy8.asm Fri Nov 08 15:04:10 2013 +0530
>@@ -948,45 +948,42 @@
> ; void blockcopy_sp_4x8(pixel *dest, intptr_t destStride, int16_t *src, intptr_t srcStride)
> ;-----------------------------------------------------------------------------
> INIT_XMM sse2
>-cglobal blockcopy_sp_4x8, 4, 6, 8, dest, destStride, src, srcStride
>+cglobal blockcopy_sp_4x8, 4, 4, 8, dest, destStride, src, srcStride
>>you have used r5

Min, r5 was in the old code, and I have removed it. I think you are talking about the deleted line [ -lea r5, [r4 + 2 * r3] ]. In the new code I have used just 4 registers.

_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
