Yes, that is right: we don't need to pass width and height. For now they are helpful in some cases, because they let me see which parameter I need to load from the stack, but when I finally integrate this into the Test Bench I will replace it with the new one.
Regards,
Praveen

On Tue, Oct 8, 2013 at 11:13 AM, Steve Borho <[email protected]> wrote:

> On Tue, Oct 8, 2013 at 12:40 AM, <[email protected]> wrote:
>
>> # HG changeset patch
>> # User praveen Tiwari
>> # Date 1381210681 -19800
>> # Node ID d9580d9cad8df6ad00d08ab538cdac27c0eb4e92
>> # Parent d71078917df01e92605158a13b45ab35ee7cfc1c
>> filterHorizontal_p_p_4, 4x4 asm code
>>
>> diff -r d71078917df0 -r d9580d9cad8d source/common/x86/ipfilter8.asm
>> --- a/source/common/x86/ipfilter8.asm Mon Oct 07 12:48:32 2013 +0530
>> +++ b/source/common/x86/ipfilter8.asm Tue Oct 08 11:08:01 2013 +0530
>> @@ -130,3 +130,56 @@
>>      RET
>>
>>  %endif ; ARCH_X86_64 == 0
>> +
>> +SECTION_RODATA 32
>> +tab_Tm: db 0, 1, 2, 3, 1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6
>> +
>> +tab_c_512: times 8 dw 512
>> +
>> +SECTION .text
>> +
>> +%macro FILTER_H4_w4 3
>> +    movu        %1, [srcq - 1]
>> +    pshufb      %2, %1, Tm0
>> +    pmaddubsw   %2, coef2
>> +    phaddw      %2, %1
>> +    pmulhrsw    %2, %3
>> +    packuswb    %2, %2
>> +%endmacro
>> +
>> +%macro FILTER_H4_w4_CALL 0
>> +    FILTER_H4_w4   x0, x1, x2
>> +
>> +    movd        [dstq], x1
>> +
>> +    add         srcq, srcstrideq
>> +    add         dstq, dststrideq
>> +%endmacro
>> +
>> +;-----------------------------------------------------------------------------
>> +; void filterHorizontal_p_p_4(pixel *src, intptr_t srcStride, pixel *dst, intptr_t dstStride, int width, int height, short const *coeff)
>
> I assume this comment is out of date? you shouldn't need to pass width
> and height
>
>> +;-----------------------------------------------------------------------------
>> +INIT_XMM sse4
>> +cglobal filterHorizontal_p_p_4, 4, 5, 5, src, srcstride, dst, dststride
>> +%define coef2       m4
>> +%define Tm0         m3
>> +%define x2          m2
>> +%define x1          m1
>> +%define x0          m0
>> +
>> +    mov         r4, r6m
>> +    movu        coef2, [r4]
>> +    packsswb    coef2, coef2
>> +    pshufd      coef2, coef2, 0
>> +
>> +    mova        x2, [tab_c_512]
>> +
>> +    mova        Tm0, [tab_Tm]
>> +
>> +%rep 3
>> +FILTER_H4_w4_CALL
>> +%endrep
>> +
>> +    FILTER_H4_w4   x0, x1, x2
>> +    movd        [dstq], x1
>> +    RET
>> _______________________________________________
>> x265-devel mailing list
>> [email protected]
>> https://mailman.videolan.org/listinfo/x265-devel
>
> --
> Steve Borho
>
> _______________________________________________
> x265-devel mailing list
> [email protected]
> https://mailman.videolan.org/listinfo/x265-devel
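
For readers following the thread, here is a scalar C sketch of what the quoted routine appears to compute, as read from the patch rather than taken from the x265 sources. tab_Tm gathers four overlapping 4-pixel windows starting at src - 1, pmaddubsw and phaddw form the 4-tap sums, pmulhrsw by 512 works out to (sum + 32) >> 6, and packuswb clamps to 8 bits. The names filter_h4_4x4_ref and clip_u8 are hypothetical, 8-bit pixels are assumed, and width and height are omitted because the block size is fixed at 4x4, which is the point raised in the review.

```c
#include <stdint.h>

/* Hypothetical helper: clamp to the 8-bit pixel range, as packuswb does. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Hypothetical scalar reference for a 4-tap horizontal filter on a 4x4 block.
 * Assumes 8-bit pixels and coefficients that sum to 64 (6-bit precision).
 * Reads src[-1] .. src[5] on each row, like the movu from [srcq - 1]. */
void filter_h4_4x4_ref(const uint8_t *src, intptr_t srcStride,
                       uint8_t *dst, intptr_t dstStride,
                       const int16_t *coeff)
{
    for (int row = 0; row < 4; row++) {
        for (int col = 0; col < 4; col++) {
            int sum = 0;
            /* Taps cover src[col - 1] .. src[col + 2], matching one row of tab_Tm. */
            for (int k = 0; k < 4; k++)
                sum += src[col - 1 + k] * coeff[k];
            /* Round and drop the 6 fractional bits; this relies on an
             * arithmetic right shift for negative sums, as pmulhrsw gives. */
            dst[col] = clip_u8((sum + 32) >> 6);
        }
        src += srcStride;
        dst += dstStride;
    }
}
```

One difference to keep in mind: the assembly packs the coefficients to signed bytes and pmaddubsw saturates its 16-bit pair sums, so this sketch should only be expected to agree with it for coefficient sets where neither step loses information.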
