>- T00 = _mm_unpacklo_epi16(T00, _mm_setzero_si128());
>+ __m128i sign = _mm_srai_epi16(T00, 15);
>+ T00 = _mm_unpacklo_epi16(T00, sign);
>  T01 = _mm_unpacklo_epi8(T01, _mm_setzero_si128());

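I guess pmovsxwd (_mm_cvtepi16_epi32, SSE4.1) would be faster here, since it sign-extends the low four words in a single instruction.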
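
For reference, a minimal sketch of the two variants side by side. This is not code from the patch: the function names (widen_lo_sse2, widen_lo_sse41, widen_paths_agree) are made up for illustration, the target attribute assumes GCC or Clang on x86-64, and I have not measured either version, so "faster" is still a guess.

```c
#include <emmintrin.h>  /* SSE2: _mm_srai_epi16, _mm_unpacklo_epi16 */
#include <smmintrin.h>  /* SSE4.1: _mm_cvtepi16_epi32 (pmovsxwd) */
#include <stdint.h>

/* SSE2 variant, as in the quoted patch: build a per-lane sign mask
 * (0x0000 or 0xFFFF) and interleave it with the low four words. */
static __m128i widen_lo_sse2(__m128i v)
{
    __m128i sign = _mm_srai_epi16(v, 15);
    return _mm_unpacklo_epi16(v, sign);
}

/* SSE4.1 variant: pmovsxwd sign-extends the low four words directly.
 * The target attribute (GCC/Clang, an assumption here) lets this
 * compile without passing -msse4.1 for the whole file. */
__attribute__((target("sse4.1")))
static __m128i widen_lo_sse41(__m128i v)
{
    return _mm_cvtepi16_epi32(v);
}

/* Hypothetical self-check: returns 1 if both variants match a scalar
 * int16 -> int32 conversion of src[0..3], 0 otherwise. */
static int widen_paths_agree(const int16_t src[8])
{
    __m128i v = _mm_loadu_si128((const __m128i *)src);
    int32_t a[4], b[4];
    int i;

    _mm_storeu_si128((__m128i *)a, widen_lo_sse2(v));
    _mm_storeu_si128((__m128i *)b, widen_lo_sse41(v));

    for (i = 0; i < 4; i++) {
        if (a[i] != (int32_t)src[i] || b[i] != (int32_t)src[i])
            return 0;
    }
    return 1;
}
```

Both give the same result for any input. The SSE4.1 version only makes sense in a primitive that is already dispatched for SSE4.1 or later; an SSE2-only path has to keep the srai/unpack pair.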
_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel
