The patch looks good; no further changes are needed. Thanks.
By the way, I can think of two reasons why you saw no change with CBNZ. First, the 'sub x9' is still in the first part of the loop; I would rather move independent instructions like this one into pipeline stall slots. Second, the loop does not run enough iterations to show a difference, since this is a small function.

At 2021-06-24 10:34:02, "Pop, Sebastian" <s...@amazon.com> wrote:
> Also added cbnz, no perf change.
_______________________________________________
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel