At 2015-03-11 12:57:34,"Praveen Tiwari" <[email protected]> wrote:



---------- Forwarded message ----------
From: chen<[email protected]>
Date: Wed, Mar 11, 2015 at 6:32 AM
Subject: Re: [x265] [PATCH] asm: intra_pred_ang16_2
To: Development for x265 <[email protected]>



>>same speed as old version


This AVX2 version of the asm code eliminates the following instructions, at
the cost of one vextracti128 instruction, compared to the SSSE3 version. It
may not have a visible impact in testBench, but it seems worth pushing.
    add             r2, 34
    cmp             r3m, byte 34
    cmove           r2, r4
[MC] The instructions above exist to share code between modes 2 and 34; your
new code uses separate functions, and vextract will use Port 5, which is a
common bottleneck.
 
    movu            m1, [r2 + 16]
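
For readers following the thread, the add / cmp / cmove sequence quoted above can be read as a branchless pointer select. The C sketch below only mirrors those three instructions; select_ref is a hypothetical helper, not x265 code, and the meaning of r4 (set up outside the quoted lines) is left as an opaque second pointer.

```c
#include <stdint.h>

/* Hypothetical helper, not part of x265: a C reading of the quoted
 * add / cmp / cmove sequence.
 *   ref  stands in for r2 on entry
 *   alt  stands in for r4 (its setup is not shown in the thread)
 *   mode stands in for r3m
 */
static const uint8_t *select_ref(const uint8_t *ref, const uint8_t *alt, int mode)
{
    const uint8_t *p = ref + 34;   /* add   r2, 34        */
    if (mode == 34)                /* cmp   r3m, byte 34  */
        p = alt;                   /* cmove r2, r4        */
    return p;
}
```

With one shared function, both modes run the same body after this select; with separate functions per mode, the select disappears, which is the trade-off being discussed.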

_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
