On Fri, Oct 11, 2013 at 10:39 PM, chen <[email protected]> wrote:

> At 2013-10-12 03:12:46, "Steve Borho" <[email protected]> wrote:
>
> > On Fri, Oct 11, 2013 at 3:40 AM, <[email protected]> wrote:
> >
> >> # HG changeset patch
> >> # User Yuvaraj Venkatesh <[email protected]>
> >> # Date 1381480768 -19800
> >> # Fri Oct 11 14:09:28 2013 +0530
> >> # Node ID 46b954edb1c52a557b9d94c4ed380ea0578c1949
> >> # Parent 8bb743458331d7cdc1008e217542e406818c5a7a
> >> dct: Replaced partialButterfly16 vector class function to intrinsic
> >
> > For some reason, this new version is 3x slower than the vector version;
> > we need to figure out why. It looks like the code flow is the same.
>
> Are you using the VS compiler? _mm_setr_epi32 is very slow on it; most of
> the time the vector version builds a constant array instead.

Yes, indeed. What should they use instead of _mm_setr_epi32?
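For illustration, here is a minimal sketch of the constant-array approach chen describes, next to the _mm_setr_epi32 form. The names (kCoeffs, coeffs_via_setr, coeffs_via_load, same_lanes) and the coefficient values are assumptions made up for this example; they are not taken from the patch. The idea is to keep the constants in a 16-byte-aligned static array and fetch them with a single _mm_load_si128. Whether this recovers the 3x on VS would still need to be measured.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical coefficient table (illustrative values only).
// 16-byte alignment is required by _mm_load_si128.
alignas(16) static const int32_t kCoeffs[4] = { 90, 87, 80, 70 };

// Builds the vector from four scalar arguments at the call site.
static inline __m128i coeffs_via_setr()
{
    return _mm_setr_epi32(90, 87, 80, 70);
}

// Loads the same four values from the constant array with one aligned load.
static inline __m128i coeffs_via_load()
{
    return _mm_load_si128(reinterpret_cast<const __m128i*>(kCoeffs));
}

// Returns true when all four 32-bit lanes of a and b are equal.
static inline bool same_lanes(__m128i a, __m128i b)
{
    return _mm_movemask_epi8(_mm_cmpeq_epi32(a, b)) == 0xFFFF;
}
```

Both helpers produce the same register contents, so swapping one for the other should not change the transform's output, only how the compiler materializes the constants.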
-- Steve Borho
_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
