On 08/03/2014 07:52 PM, Steve Borho wrote:
On 08/02, dave wrote:
A few other things...
In my testing of encoding a single frame, where I only examined processing
of the first few CUs, this search method always found the lowest cost, but
the cost values never followed a consistent curve with a single low point.
Without further testing or deeper knowledge of the angular intra modes I
wouldn't guarantee the lowest cost will always be found.
It's understood there will be a trade-off between compression and
performance; so this is ok. It would be interesting to measure the
average "error" or difference in cost between the lowest cost found by
the fast-scan and the lowest cost found by the exhaustive scan. If
nothing else we can advertise this number in the documentation for the
option.
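
One way the "error" measurement above could be set up, as a minimal sketch
only: it assumes both searches are run on the same blocks and that each
reports its lowest cost. The names BlockCosts, GapStats and measureGap are
hypothetical and do not exist in x265.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-block record: the lowest cost each search reported.
struct BlockCosts
{
    uint32_t exhaustive; // lowest cost over all modes
    uint32_t fast;       // lowest cost found by the fast scan
};

struct GapStats
{
    double meanAbsolute; // average of (fast - exhaustive)
    double meanRelative; // average of (fast - exhaustive) / exhaustive
    size_t missed;       // blocks where the fast scan missed the minimum
};

GapStats measureGap(const std::vector<BlockCosts>& blocks)
{
    GapStats s = { 0.0, 0.0, 0 };
    if (blocks.empty())
        return s;

    for (const BlockCosts& b : blocks)
    {
        // The exhaustive scan sees every mode, so it can never be worse.
        assert(b.fast >= b.exhaustive);
        uint32_t diff = b.fast - b.exhaustive;
        s.meanAbsolute += diff;
        if (b.exhaustive)
            s.meanRelative += double(diff) / b.exhaustive;
        if (diff)
            s.missed++;
    }
    s.meanAbsolute /= blocks.size();
    s.meanRelative /= blocks.size();
    return s;
}
```

The relative figure is probably the more useful one to document, since
absolute costs scale with block size.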
Also, unfortunately my system is old and doesn't support SSE4, and that is
the only level of assembler supported for intra mode predictions, so I was
only able to develop and test with the C primitives. I need to upgrade...
that's unfortunate; x265 really prefers SSE4.
Yes, there was a small performance increase that seemed larger when encoding
a single frame than a short video. Neither produced consistent results on
my system but the new search method was slightly faster most of the time
when encoding a single frame and always faster when encoding a short video.
Looking at the code, I had forgotten about the details of picking
filtered or unfiltered sources based on the size and angle. This is
something else that the 'all-angs' function does for you implicitly.
all-angs uses a lookup table in intrapred.cpp to determine filtering.
If the table were more publicly available that would help speed things
up. Also, the table and Predict::filteringIntraReferenceSamples don't
return the same results for all possible values.
I'm pretty sure the fast scan would be faster if it ran the 'all-angs'
function and then only measured satd/sa8d in the same pattern you have
now. The only gotcha will be to make sure you compare with transposed
source pixels for those modes that require it.
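
A rough sketch of that idea, under several assumptions that are not taken
from the patch: the all-angs output is laid out as 33 predictions (modes 2
to 34) back to back, modes below 18 are stored transposed, plain SAD stands
in for satd/sa8d, and the scan pattern (every 4th mode, then refine by 2
and by 1) is only an illustrative coarse-to-fine pattern. The names
fastAngularScan, ScanResult, sadBlock and transpose are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

typedef uint8_t pixel;

struct ScanResult { int mode; uint32_t cost; };

// Plain SAD over a size x size block; a stand-in for satd/sa8d.
static uint32_t sadBlock(const pixel* a, const pixel* b, int size)
{
    uint32_t sum = 0;
    for (int i = 0; i < size * size; i++)
        sum += std::abs(int(a[i]) - int(b[i]));
    return sum;
}

static std::vector<pixel> transpose(const std::vector<pixel>& src, int size)
{
    std::vector<pixel> dst(src.size());
    for (int y = 0; y < size; y++)
        for (int x = 0; x < size; x++)
            dst[x * size + y] = src[y * size + x];
    return dst;
}

// allAngs holds every angular prediction already; only the cost
// measurements are restricted to the coarse-to-fine pattern.
ScanResult fastAngularScan(const std::vector<pixel>& allAngs,
                           const std::vector<pixel>& src, int size)
{
    assert(allAngs.size() == size_t(33) * size * size);
    std::vector<pixel> srcT = transpose(src, size);

    auto cost = [&](int mode) -> uint32_t {
        const pixel* pred = &allAngs[size_t(mode - 2) * size * size];
        // Assumed layout: modes below 18 are stored transposed, so they
        // must be compared against the transposed source.
        const pixel* cmp = mode < 18 ? srcT.data() : src.data();
        return sadBlock(pred, cmp, size);
    };

    ScanResult best = { 2, cost(2) };

    // Coarse pass: every 4th mode (6, 10, ..., 34).
    for (int mode = 6; mode <= 34; mode += 4)
    {
        uint32_t c = cost(mode);
        if (c < best.cost)
            best = ScanResult{ mode, c };
    }

    // Refinement: check neighbours at distance 2, then distance 1.
    for (int step = 2; step >= 1; step--)
    {
        int center = best.mode;
        for (int mode : { center - step, center + step })
        {
            if (mode < 2 || mode > 34)
                continue;
            uint32_t c = cost(mode);
            if (c < best.cost)
                best = ScanResult{ mode, c };
        }
    }
    return best;
}
```

This measures at most 13 of the 33 modes. As noted earlier in the thread,
the costs do not necessarily follow a curve with a single low point, so a
scan like this can settle on a mode that is not the true minimum.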
On 08/01/2014 09:50 PM, Steve Borho wrote:
On 08/01, dave wrote:
I am submitting a patch to implement the faster intra search suggested by
Steve Borho here:
https://mailman.videolan.org/pipermail/x265-devel/2014-July/004873.html
nice!
The patch implements the faster search in slicetype.cpp. The same
approach can also be used in analysis.cpp; it's a little more than a
simple cut and paste, though it shouldn't take long.
yep
TEncSearch.cpp also calls intra_pred_allangs for which this is the faster
alternative but at first look, TEncSearch.cpp is more complex. Depending on
what is desired, either this search could greatly simplify this part of
TEncSearch.cpp or it might not be applicable to what TEncSearch.cpp is
doing. I will be looking into it.
it should also apply here; this function is a little more complicated
because it keeps a "best N" list of modes, and then performs
rate-distortion measurements (encodes each option and records the actual
distortion and bit cost) to select the final intra mode. The same 'fast
scan' mode could be used to build the 'best N' list. But I imagine most
presets that use this RDO version of intra will not want a fast scan.
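
To illustrate the "best N" idea in isolation, here is a minimal sketch of a
bounded candidate list kept in ascending cost order. It is not the
TEncSearch.cpp code; BestModes and ModeCost are hypothetical names, and the
costs fed to it could come from either an exhaustive or a fast scan.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ModeCost
{
    int      mode;
    uint64_t cost;
};

// Keeps at most maxCount candidates, sorted by ascending cost.
class BestModes
{
public:
    explicit BestModes(size_t maxCount) : m_max(maxCount)
    {
        assert(maxCount > 0);
    }

    void add(int mode, uint64_t cost)
    {
        // Walk back from the end to find the insertion position.
        size_t pos = m_list.size();
        while (pos > 0 && m_list[pos - 1].cost > cost)
            pos--;

        if (pos >= m_max)
            return; // worse than every kept candidate and the list is full

        m_list.insert(m_list.begin() + pos, ModeCost{ mode, cost });
        if (m_list.size() > m_max)
            m_list.pop_back(); // drop the most expensive candidate
    }

    const std::vector<ModeCost>& candidates() const { return m_list; }

private:
    size_t                m_max;
    std::vector<ModeCost> m_list;
};
```

The surviving candidates would then each go through the full
rate-distortion measurement, and the cheapest one after that step wins.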
_______________________________________________
x265-devel mailing list
[email protected]
https://mailman.videolan.org/listinfo/x265-devel