At 2015-03-11 12:57:34, "Praveen Tiwari" <[email protected]> wrote:
---------- Forwarded message ----------
From: chen <[email protected]>
Date: Wed, Mar 11, 2015 at 6:32 AM
Subject: Re: [x265] [PATCH] asm: intra_pred_ang16_2
To: Development for x265 <[email protected]>

>> same speed as the old version

This AVX2 version of the asm code eliminates the following instructions, at the cost of one vextracti128 instruction, compared to the SSSE3 version. It may not have a visible impact in the TestBench, but it seems worth pushing.

    add         r2, 34
    cmp         r3m, byte 34
    cmove       r2, r4

[MC] The instructions above are there to share code between modes 2 and 34; your new code uses separate functions, and vextract uses Port 5, which is a common bottleneck.

    movu        m1, [r2 + 16]
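
For readers following the thread, the add / cmp / cmove sequence quoted above can be read as a branch-free pointer selection that lets one routine serve both mode 2 and mode 34. Below is a minimal C sketch of what those three instructions compute. The function name `select_ref` and its parameter names are hypothetical, chosen only to mirror the registers in the quoted asm; they are not x265 identifiers, and what each pointer actually refers to in the predictor is not stated in this message.

```c
#include <stdint.h>

/* Hypothetical illustration of the quoted sequence, not x265 code:
 *     add   r2, 34
 *     cmp   r3m, byte 34
 *     cmove r2, r4
 * r2 and r4 stand for the two candidate source pointers; mode is the
 * intra prediction mode passed in r3m. */
const uint8_t *select_ref(const uint8_t *r2, const uint8_t *r4, int mode)
{
    r2 += 34;                        /* add r2, 34                          */
    return (mode == 34) ? r4 : r2;   /* cmp r3m, byte 34 ; cmove r2, r4     */
}
```

With separate functions per mode, as in the proposed patch, this selection is unnecessary because each function already knows which pointer it reads from.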
_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
