Hi George,



Thanks for the improved patch.

I have just a few small comments below.




At 2025-03-08 00:41:05, "George Steed" <george.st...@arm.com> wrote:
> source/common/aarch64/pixel-util.S | 94 +++++++++++++-----------------
> 1 file changed, 42 insertions(+), 52 deletions(-)
>
>diff --git a/source/common/aarch64/pixel-util.S b/source/common/aarch64/pixel-util.S
>index d8b3f4365..6635e52b1 100644
>--- a/source/common/aarch64/pixel-util.S
>+++ b/source/common/aarch64/pixel-util.S
>@@ -2213,27 +2213,25 @@ endfunc
> //     const uint16_t* scanCG4x4, // x6
> //     const int trSize)          // x7
> function PFX(scanPosLast_neon)
>-.Loop_spl:
>-    // position of current CG
>+    ldr             q28, [x10]              // v28 = mask for pmovmskb
>+    add             x10, x7, x7             // 2*x7
>+    add             x11, x7, x7, lsl #1     // 3*x7
>+    add             x9, x4, #1              // CG count
>+

>+1:
This is a GCC-style numeric local label; please keep the generic named style of local label, such as the original .Loop_spl.




>     // coeffFlag = reverse_bit(w15) in 16-bit
>-    rbit            w12, w15
>-    lsr             w12, w12, #16
>-    fmov            s30, w12
>+    rbit            w12, w13

>+    and             w12, w12, #0xffff
Is this AND necessary?
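To spell out the reasoning behind the question: strh writes only the low 16 bits of w12 to memory, so if nothing after the store reads the upper half of w12 (an assumption, since the rest of the loop is not quoted here), the mask should not change the stored value. A small C model of that, as a sketch only; rbit32, strh_model and check_mask_redundant are hypothetical helper names for illustration, not x265 code:

```c
#include <assert.h>
#include <stdint.h>

/* Model of AArch64 RBIT on a W register: reverse all 32 bits. */
static uint32_t rbit32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

/* Model of STRH: only the low 16 bits of the register reach memory. */
static uint16_t strh_model(uint32_t w)
{
    return (uint16_t)w;
}

/* The halfword stored with and without the 0xffff mask is the same. */
static void check_mask_redundant(void)
{
    const uint32_t samples[] = { 0u, 1u, 0x8001a5c3u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        uint32_t w12 = rbit32(samples[i]);
        assert(strh_model(w12 & 0xffffu) == strh_model(w12));
    }
}
```

If w12 is read again later as a full 32-bit value, the mask would still be needed; this only covers the store itself.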


>     strh            w12, [x3], #2

> 
>-    // compute coeffNum = popcount(coeffFlag)
>-    cnt             v30.8b, v30.8b
>-    addp            v30.8b, v30.8b, v30.8b
>-    fmov            w6, s30

>-    sub             x5, x5, x6
We do not need the 64-bit x5 here; the 32-bit w5 should be enough.


>-    strb            w6, [x4], #1
>-
>-    cbnz            x5, .Loop_spl

>+    cbnz            x5, 1b
Same comment about x5 here.

_______________________________________________
x265-devel mailing list
x265-devel@videolan.org
https://mailman.videolan.org/listinfo/x265-devel
