At 2014-12-12 00:40:01, "Steve Borho" <[email protected]> wrote:
>On 12/11, Divya Manivannan wrote:
>> # HG changeset patch
>> # User Divya Manivannan <[email protected]>
>> # Date 1418296477 -19800
>> #      Thu Dec 11 16:44:37 2014 +0530
>> # Node ID 440d264fcdf33889b665848f19e87ca3559d1b6c
>> # Parent  667e4ea0899fcf026ee9df935381487d3148ed0c
>> integrate assembly code for psyCost_pp
>>
>> diff -r 667e4ea0899f -r 440d264fcdf3 source/common/pixel.cpp
>> --- a/source/common/pixel.cpp Thu Dec 11 09:36:16 2014 +0530
>> +++ b/source/common/pixel.cpp Thu Dec 11 16:44:37 2014 +0530
>> @@ -815,10 +815,11 @@
>>                  for (int j = 0; j < dim; j+= 8)
>>                  {
>>                      /* AC energy, measured by sa8d (AC + DC) minus SAD (DC) */
>> -                    int sourceEnergy = sa8d_8x8(source + i * sstride + j, sstride, zeroBuf, 0) -
>> -                                       (sad<8, 8>(source + i * sstride + j, sstride, zeroBuf, 0) >> 2);
>> -                    int reconEnergy = sa8d_8x8(recon + i * rstride + j, rstride, zeroBuf, 0) -
>> -                                      (sad<8, 8>(recon + i * rstride + j, rstride, zeroBuf, 0) >> 2);
>> +                    // PartitionFromSizes(8, 8) = 1
>> +                    int sourceEnergy = primitives.sa8d[1](source + i * sstride + j, sstride, zeroBuf, 0) -
>> +                                       (primitives.sad[1](source + i * sstride + j, sstride, zeroBuf, 0) >> 2);
>> +                    int reconEnergy = primitives.sa8d[1](recon + i * rstride + j, rstride, zeroBuf, 0) -
>> +                                      (primitives.sad[1](recon + i * rstride + j, rstride, zeroBuf, 0) >> 2);
>
>This is an improvement over just C code, but it is still vastly slower
>than writing new assembly functions for these. The function call
>overhead is non-trivial.
>

Both calls reuse the same input, so a new dedicated function would also let us avoid many of the loads and stores.
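To illustrate the point about reusing the input, here is a rough C++ sketch of what a fused 4x4 helper could look like. It is only an illustration, not the x265 C reference or the eventual assembly: the names psyEnergy4x4 and psyCost4x4 are hypothetical, 8-bit pixels are assumed, and the SATD is assumed to be an unnormalized 4x4 Hadamard sum halved. The block is loaded once and both the SAD against zero and the SATD are derived from those same values.

```cpp
#include <cstdint>
#include <cstdlib>

typedef std::uint8_t pixel; // assumption: 8-bit pixel build

// Hypothetical fused helper: reads the 4x4 block once, then computes both
// the SAD against a zero block (the plain pixel sum) and a Hadamard SATD.
static int psyEnergy4x4(const pixel* src, std::intptr_t stride)
{
    int m[4][4];
    int sad = 0;

    // Single pass over memory: every pixel is loaded exactly once.
    for (int y = 0; y < 4; y++)
    {
        for (int x = 0; x < 4; x++)
        {
            m[y][x] = src[y * stride + x];
            sad += m[y][x];
        }
    }

    // Horizontal 4-point Hadamard butterflies on each row.
    for (int y = 0; y < 4; y++)
    {
        int a0 = m[y][0] + m[y][1];
        int a1 = m[y][0] - m[y][1];
        int a2 = m[y][2] + m[y][3];
        int a3 = m[y][2] - m[y][3];
        m[y][0] = a0 + a2;
        m[y][1] = a1 + a3;
        m[y][2] = a0 - a2;
        m[y][3] = a1 - a3;
    }

    // Vertical butterflies on each column, accumulating absolute values.
    int satd = 0;
    for (int x = 0; x < 4; x++)
    {
        int a0 = m[0][x] + m[1][x];
        int a1 = m[0][x] - m[1][x];
        int a2 = m[2][x] + m[3][x];
        int a3 = m[2][x] - m[3][x];
        satd += std::abs(a0 + a2) + std::abs(a1 + a3) +
                std::abs(a0 - a2) + std::abs(a1 - a3);
    }

    // Same shape as the patch: SATD (AC + DC) minus a quarter of the SAD.
    return (satd >> 1) - (sad >> 2);
}

// Hypothetical 4x4 psy cost built on the fused helper.
static int psyCost4x4(const pixel* source, std::intptr_t sstride,
                      const pixel* recon, std::intptr_t rstride)
{
    return std::abs(psyEnergy4x4(source, sstride) - psyEnergy4x4(recon, rstride));
}
```

With this shape each block costs one function call and one pass over memory instead of two, which is where I would expect the saving to come from; a real assembly version would keep the loaded rows in registers.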
>>
>>                      totEnergy += abs(sourceEnergy - reconEnergy);
>>                  }
>> @@ -828,8 +829,11 @@
>>      else
>>      {
>>          /* 4x4 is too small for sa8d */
>> -        int sourceEnergy = satd_4x4(source, sstride, zeroBuf, 0) - (sad<4, 4>(source, sstride, zeroBuf, 0) >> 2);
>> -        int reconEnergy = satd_4x4(recon, rstride, zeroBuf, 0) - (sad<4, 4>(recon, rstride, zeroBuf, 0) >> 2);
>> +        // partitionFromSizes(4, 4) = 0
>> +        int sourceEnergy = primitives.satd[0](source, sstride, zeroBuf, 0) -
>> +                           (primitives.sad[0](source, sstride, zeroBuf, 0) >> 2);
>> +        int reconEnergy = primitives.satd[0](recon, rstride, zeroBuf, 0) -
>> +                          (primitives.sad[0](recon, rstride, zeroBuf, 0) >> 2);
>>          return abs(sourceEnergy - reconEnergy);
>>      }
>>  }
>> _______________________________________________
>> x265-devel mailing list
>> [email protected]
>> https://mailman.videolan.org/listinfo/x265-devel
>
>--
>Steve Borho
>_______________________________________________
>x265-devel mailing list
>[email protected]
>https://mailman.videolan.org/listinfo/x265-devel
