> +.macro filterPixelToShort_64xN h
> +function x265_filterPixelToShort_64x\h\()_neon
> +    add             x3, x3, x3
> +    sub             x3, x3, #0x40
> +    movi            v4.8h, #0xe0, lsl #8
> +.rept \h
> I don't think unrolling N times is a good idea: the code section becomes
> too large, which will most likely cause instruction-cache evictions and misses.

Please find attached the amended patch, which now uses a loop instead of the unrolled .rept block.
Ok to commit?
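
For reference, here is a rough scalar C sketch of what I understand the kernel to compute, with the row loop kept rolled rather than expanded h times. It is only an illustration, not the contents of the patch: the function name, the constant names, and the element-based strides are hypothetical, and the shift of 6 and offset of 8192 assume an 8-bit build (the offset matches the 0xe000 constant loaded by the quoted movi).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for an 8-bit build: shift left by 6, then subtract 8192.
 * The quoted `movi v4.8h, #0xe0, lsl #8` loads 0xe000, which is -8192
 * when read as a signed 16-bit lane. */
enum { P2S_SHIFT = 6, P2S_OFFSET = 8192 };

/* Hypothetical scalar model of a 64xN pixel-to-short conversion.
 * Strides are given in elements here, not in bytes. */
static void pixel_to_short_64xN(const uint8_t *src, ptrdiff_t src_stride,
                                int16_t *dst, ptrdiff_t dst_stride, int h)
{
    /* One iteration per row: code size stays constant as h grows,
     * unlike a body repeated h times by the assembler. */
    for (int row = 0; row < h; row++) {
        for (int col = 0; col < 64; col++)
            dst[col] = (int16_t)((src[col] << P2S_SHIFT) - P2S_OFFSET);
        src += src_stride;
        dst += dst_stride;
    }
}
```

With a rolled loop the instruction footprint is the same for every block height, at the cost of one counter update and branch per row.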

Thanks,
Sebastian

Attachment: 0001-arm64-port-x265_filterPixelToShort_-_neon.patch

