On 08/02, dave wrote:
> A few other things...
>
> In my testing of encoding a single frame where I only examined processing of
> the first few CUs this search method always found the lowest cost but cost
> values never followed a consistent curve with a single low point.  Without
> further testing or deeper knowledge of the angle intra modes I wouldn't
> guarantee the lowest cost will always be found.

It's understood there will be a trade-off between compression and
performance, so this is ok.  It would be interesting to measure the
average "error" or difference in cost between the lowest cost found by
the fast-scan and the lowest cost found by the exhaustive scan.  If
nothing else we can advertise this number in the documentation for the
option.

> Also, unfortunately my system is old and doesn't support sse4 and that is
> the only level of assembler supported for intra mode predictions so I was
> only able to develop and test for the c primitives.  I need to upgrade...

that's unfortunate; x265 really prefers SSE4.

> Yes, there was a small performance increase that seemed larger when encoding
> a single frame than a short video.  Neither produced consistent results on
> my system but the new search method was slightly faster most of the time
> when encoding a single frame and always faster when encoding a short video.

Looking at the code, I had forgotten about the details of picking
filtered or unfiltered sources based on the size and angle.  This is
something else that the 'all-angs' function does for you implicitly.

I'm pretty sure the fast scan would be faster if it ran the 'all-angs'
function and then only measured satd/sa8d in the same pattern you have
now.  The only gotcha will be to make sure you compare with transposed
source pixels for those modes that require it.

> On 08/01/2014 09:50 PM, Steve Borho wrote:
> >On 08/01, dave wrote:
> >>I am submitting a patch to implement the faster intra search suggested by
> >>Steve Borho here:
> >>
> >>https://mailman.videolan.org/pipermail/x265-devel/2014-July/004873.html
> >nice!
> >
> >>The patch is implements the faster search in slicetype.cpp.  The same
> >>approach can also be used in analysis.cpp but it's a little more than a
> >>simple cut and paste though it shouldn't take long.
> >yep
> >
> >>TEncSearch.cpp also calls intra_pred_allangs for which this is the faster
> >>alternative but at first look, TEncSearch.cpp is more complex.  Depending on
> >>what is desired, either this search could greatly simplify this part of
> >>TEncSearch.cpp or it might not be applicable to what TEncSearch.cpp is
> >>doing.  I will be looking into it.
> >it should also apply here; this function is a little more complicated
> >because it keeps a "best N" list of modes, and then performs
> >rate-distortion measurements (encodes each option and records the actual
> >distortion and bit cost) to select the final intra mode.  The same 'fast
> >scan' mode could be used to build the 'best N' list.  But I imagine most
> >presets that use this RDO version of intra will not want a fast scan.
> >
>
> _______________________________________________
> x265-devel mailing list
> [email protected]
> https://mailman.videolan.org/listinfo/x265-devel

--
Steve Borho
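
The average "error" measurement mentioned above could be prototyped roughly
as follows.  This is only a sketch and is not taken from the patch: the
coarse-to-fine pattern (every 4th angular mode, then +/-2, then +/-1), the
CostTable type and the function names are all assumptions made for
illustration.  It takes a table of per-mode costs (e.g. satd/sa8d) for each
block and reports how far the fast scan lands above the true minimum on
average.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <vector>

// HEVC angular intra modes are numbered 2..34 (0 is planar, 1 is DC).
constexpr int kFirstAngular = 2;
constexpr int kLastAngular = 34;
constexpr int kNumModes = 35;

// Hypothetical container: cost of each intra mode for one block, indexed by
// mode number.  Entries 0 and 1 are not used by the angular scans below.
using CostTable = std::array<uint32_t, kNumModes>;

// Exhaustive scan: the lowest-cost angular mode (first one wins on ties).
int exhaustiveBest(const CostTable& cost)
{
    int best = kFirstAngular;
    for (int m = kFirstAngular + 1; m <= kLastAngular; m++)
        if (cost[m] < cost[best])
            best = m;
    return best;
}

// Assumed fast scan: test modes 2, 6, 10, ..., 34, then refine around the
// best candidate at distance 2 and finally at distance 1.
int fastBest(const CostTable& cost)
{
    int best = kFirstAngular;
    for (int m = kFirstAngular + 4; m <= kLastAngular; m += 4)
        if (cost[m] < cost[best])
            best = m;

    for (int step = 2; step >= 1; step >>= 1)
    {
        const int center = best; // both neighbours are relative to this mode
        for (int m : {center - step, center + step})
            if (m >= kFirstAngular && m <= kLastAngular && cost[m] < cost[best])
                best = m;
    }
    return best;
}

// Mean of (fast-scan cost - exhaustive cost) over all blocks.  A value of 0
// means the fast scan found the true minimum on every block.
double averageExcessCost(const std::vector<CostTable>& blocks)
{
    if (blocks.empty())
        return 0.0;

    uint64_t excess = 0;
    for (const CostTable& cost : blocks)
    {
        const uint32_t fast = cost[fastBest(cost)];
        const uint32_t exact = cost[exhaustiveBest(cost)];
        assert(fast >= exact); // the exhaustive scan is a lower bound
        excess += fast - exact;
    }
    return static_cast<double>(excess) / static_cast<double>(blocks.size());
}
```

When the cost curve has a single low point the two scans agree; when it has
several local minima (as dave observed) the fast scan can settle on the wrong
one, and that gap is exactly what the average captures.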
