Hi,
The code looks good. There is a small performance change because of a pipeline stall: the two LD1 instructions cannot hide their load latency. It is not a big problem, though, since we save code size.

Could you please make a standalone patch? I don't think a patch on top of a patch is a good idea.

Regards,
Min Chen

At 2021-07-31 02:27:36, "Pop, Sebastian" <s...@amazon.com> wrote:

A small change to save a few bytes in code size.
I replaced the four 2-register LD1 loads with two 4-register LD1 loads.
No performance change.
_______________________________________________
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel