> [MC] Excuse me, I think it is
> db -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0

Right, I messed up my endianness.

> To implement this change , we need to modify HM code.
> [MC] we can define the table in asm file, but we have to modify HM. of
> course, it is easy things

You don't have to, of course (you know the code better than I and whether or
not it's a good idea to change it).

>> +
>> +    mov         tmp, offset2
>> +    movd        sumOffset, tmp
>> +    pshufd      sumOffset, sumOffset, 0
>
> You can movd directly from memory; going through a register is much
> slower, especially on AMD machines.
> [MC] are you means, we put constant into memory and load it once?

movd sumOffset, offset2

> [MC] no way, x264 macro have a bug here, you can remove reduce x2 and check
> the output, the xmm0 seems Intel limit

That makes sense, I don't think the x264 macro was ever designed to support
non-AVX pblendvb. I don't recommend non-AVX pblendvb anyways as it's a lot
slower because of the extra register dependency (it's like 3 uops or
something).

Jason
