Re: [x265] [PATCH rfc] aq: implementation of Fine-grained Adaptive Quantization

Gopu Govindaswamy Mon, 16 Mar 2015 22:20:03 -0700

Thanks for the review.

On Tue, Mar 17, 2015 at 4:27 AM, Steve Borho <[email protected]> wrote:


> On 03/16, [email protected] wrote:
> > # HG changeset patch
> > # User Gopu Govindaswamy <[email protected]>
> > # Date 1426504011 -19800
> > #      Mon Mar 16 16:36:51 2015 +0530
> > # Node ID 615b61dd2be5e8ef1a7fe2f22edcac6e437f300d
> > # Parent  6461985f33ac6fc5b205879bbb0f2a535226ca76
> > aq: implementation of Fine-grained Adaptive Quantization
>
> nit: prefer lower case for Fine
>
>
OK.


> > Currently adaptive quantization adjusts the QP values on 64x64 pixel
> coding tree
> > units (CTUs) across a video frame. the new param option --max-dqp-depth
> will
> > enable quantization parameter (QP) to be adjusted to individual
> quantization
> > groups (QGs)
> >
> > Example:
> > --max-dqp-depth=0 for 64x64 blocks
> > --max-dqp-depth=1 for 32x32 blocks
> > --max-dqp-depth=2 for 16x16 blocks
>
> what if --ctu is not 64?  This patch crashes about 1/3 of the smoke tests
>

OK, this is my fault. I have added validation in the encoder configure
step, like:

if (p->rc.maxCuDQPDepth > (int32_t)(g_maxCUDepth - 1))
  then reset maxCuDQPDepth to its default value (depth 0)

The current patch supports only CU sizes 64x64, 32x32 and 16x16.
For example, if --ctu=32, maxCUDepth is 2 and we support depths 0 and 1,
i.e. 32x32 and 16x16. The same applies to --ctu=16.
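
The depth-to-block-size mapping and the range check above can be sketched
as follows. This is a minimal illustration, not the code in the patch: the
helper names qgSize and validateMaxDqpDepth are hypothetical, and rejecting
negative values is an added assumption (only the upper bound is described
above).

```cpp
#include <cassert>
#include <cstdint>

// Size in pixels of the quantization group at a given depth:
// each depth level halves the CTU width and height.
constexpr uint32_t qgSize(uint32_t ctuSize, uint32_t depth)
{
    return ctuSize >> depth;
}

// Mirrors the check described above: a requested depth greater than
// maxCUDepth - 1 falls back to the default of 0.
// Assumption: negative requests are also reset to 0.
constexpr int validateMaxDqpDepth(int requested, uint32_t maxCUDepth)
{
    return (requested < 0 || requested > static_cast<int>(maxCUDepth) - 1)
               ? 0
               : requested;
}

// With --ctu=64 (maxCUDepth 3), depth 2 is accepted and maps to 16x16.
static_assert(validateMaxDqpDepth(2, 3) == 2, "depth 2 valid for 64x64 CTU");
static_assert(qgSize(64, 2) == 16, "depth 2 of a 64x64 CTU is 16x16");
```

With --ctu=32 (maxCUDepth 2), validateMaxDqpDepth(2, 2) yields 0, so the
encoder would fall back to one QP per CTU.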


> > currently this feature not supported for block 8x8
> >
> > sample test results for each depth
> >
> > clip - ducks_take_off_420_720p50.y4m
> > preset=medium
> > max-dqp-depth 0 - encoded 500 frames in 36.86s (13.56 fps), 4575.09 kb/s,
> > Global PSNR: 29.587, SSIM Mean Y: 0.8309761 ( 7.721 dB)
> > max-dqp-depth 1 - encoded 500 frames in 43.00s (11.63 fps), 4606.96 kb/s,
> > Global PSNR: 29.590, SSIM Mean Y: 0.8313855 ( 7.731 dB)
> > max-dqp-depth 2 - encoded 500 frames in 35.47s (14.10 fps), 4599.65 kb/s,
> > Global PSNR: 29.575, SSIM Mean Y: 0.8311820 ( 7.726 dB)
> >
> > preset=veryslow
> > max-dqp-depth 0 - encoded 500 frames in 499.24s (1.00 fps), 4407.79 kb/s,
> > Global PSNR: 29.890, SSIM Mean Y: 0.8419664 ( 8.013 dB)
> > max-dqp-depth 1 - encoded 500 frames in 497.96s (1.00 fps),
> > 4413.64 kb/s, Global PSNR: 29.884, SSIM Mean Y: 0.8420085 ( 8.014 dB)
> > max-dqp-depth 2 - encoded 500 frames in 511.36s (0.98 fps), 4428.71 kb/s,
> > Global PSNR: 29.877, SSIM Mean Y: 0.8419621 ( 8.012 dB)
> >
> > -----------------------------------------
> > clip - Cactus_1920x1080_50.y4m
> > preset=medium
> > max-dqp-depth 0 - encoded 100 frames in 13.61s (7.35 fps), 2588.25 kb/s,
> > Global PSNR: 34.890, SSIMMean Y: 0.8685867 ( 8.814 dB)
> > max-dqp-depth 1 - encoded 100 frames in 12.15s (8.23 fps), 2629.22 kb/s,
> > Global PSNR: 34.901, SSIMMean Y: 0.8689989 ( 8.827 dB)
> > max-dqp-depth 2 - encoded 100 frames in 12.26s (8.16 fps), 2624.31 kb/s,
> > Global PSNR: 34.864, SSIMMean Y: 0.8688061 ( 8.821 dB)
> >
> > preset=veryslow
> > max-dqp-depth 0 - encoded 100 frames in 138.68s (0.72 fps), 2277.00 kb/s,
> > Global PSNR: 35.118, SSIM Mean Y: 0.8725818 ( 8.948 dB)
> > max-dqp-depth 1 - encoded 100 frames in 137.21s (0.73 fps), 2293.83 kb/s,
> > Global PSNR: 35.117, SSIM Mean Y: 0.8725589 ( 8.947 dB)
> > max-dqp-depth 2 - encoded 100 frames in 134.96s (0.74 fps), 2299.79 kb/s,
> > Global PSNR: 35.109, SSIM Mean Y: 0.8727326 ( 8.953 dB)
>
> this doesn't tell us much; I think the most compelling change from this
> commit will be in visual quality that is not easily measured.
>

OK.


>
> > diff -r 6461985f33ac -r 615b61dd2be5 source/common/cudata.cpp
> > --- a/source/common/cudata.cpp        Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/common/cudata.cpp        Mon Mar 16 16:36:51 2015 +0530
> > @@ -298,7 +298,7 @@
> >  }
> >
> >  // initialize Sub partition
> > -void CUData::initSubCU(const CUData& ctu, const CUGeom& cuGeom)
> > +void CUData::initSubCU(const CUData& ctu, const CUGeom& cuGeom, const
> int qp)
> >  {
> >      m_absIdxInCTU   = cuGeom.absPartIdx;
> >      m_encData       = ctu.m_encData;
> > @@ -312,8 +312,11 @@
> >      m_cuAboveRight  = ctu.m_cuAboveRight;
> >      X265_CHECK(m_numPartitions == cuGeom.numPartitions, "initSubCU()
> size mismatch\n");
> >
> > -    /* sequential memsets */
> > -    m_partSet((uint8_t*)m_qp, (uint8_t)ctu.m_qp[0]);
> > +    if (cuGeom.depth <= (uint32_t)m_encData->m_param->rc.maxCuDQPDepth)
> > +        m_partSet((uint8_t*)m_qp, (uint8_t)qp);
> > +    else
> > +        m_partSet((uint8_t*)m_qp, (uint8_t)ctu.m_qp[0]);
> > +
> >      m_partSet(m_log2CUSize,   (uint8_t)cuGeom.log2CUSize);
> >      m_partSet(m_lumaIntraDir, (uint8_t)DC_IDX);
> >      m_partSet(m_tqBypass,     (uint8_t)m_encData->m_param->bLossless);
> > diff -r 6461985f33ac -r 615b61dd2be5 source/common/cudata.h
> > --- a/source/common/cudata.h  Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/common/cudata.h  Mon Mar 16 16:36:51 2015 +0530
> > @@ -182,7 +182,7 @@
> >      static void calcCTUGeoms(uint32_t ctuWidth, uint32_t ctuHeight,
> uint32_t maxCUSize, uint32_t minCUSize, CUGeom
> cuDataArray[CUGeom::MAX_GEOMS]);
> >
> >      void     initCTU(const Frame& frame, uint32_t cuAddr, int qp);
> > -    void     initSubCU(const CUData& ctu, const CUGeom& cuGeom);
> > +    void     initSubCU(const CUData& ctu, const CUGeom& cuGeom, const
> int qp);
> >      void     initLosslessCU(const CUData& cu, const CUGeom& cuGeom);
> >
> >      void     copyPartFrom(const CUData& cu, const CUGeom& childGeom,
> uint32_t subPartIdx);
> > diff -r 6461985f33ac -r 615b61dd2be5 source/common/param.cpp
> > --- a/source/common/param.cpp Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/common/param.cpp Mon Mar 16 16:36:51 2015 +0530
> > @@ -210,6 +210,7 @@
> >      param->rc.zones = NULL;
> >      param->rc.bEnableSlowFirstPass = 0;
> >      param->rc.bStrictCbr = 0;
> > +    param->rc.maxCuDQPDepth = 0;
> >
> >      /* Video Usability Information (VUI) */
> >      param->vui.aspectRatioIdc = 0;
> > @@ -839,6 +840,7 @@
> >      OPT2("pools", "numa-pools") p->numaPools = strdup(value);
> >      OPT("lambda-file") p->rc.lambdaFileName = strdup(value);
> >      OPT("analysis-file") p->analysisFileName = strdup(value);
> > +    OPT("max-dqp-depth") p->rc.maxCuDQPDepth = atoi(value);
> >      else
> >          return X265_PARAM_BAD_NAME;
> >  #undef OPT
> > diff -r 6461985f33ac -r 615b61dd2be5 source/common/quant.cpp
> > --- a/source/common/quant.cpp Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/common/quant.cpp Mon Mar 16 16:36:51 2015 +0530
> > @@ -225,13 +225,13 @@
> >      X265_FREE(m_fencShortBuf);
> >  }
> >
> > -void Quant::setQPforQuant(const CUData& cu)
> > +void Quant::setQPforQuant(const CUData& cu, const int qp)
> >  {
> >      m_tqBypass = !!cu.m_tqBypass[0];
> >      if (m_tqBypass)
> >          return;
> >      m_nr = m_frameNr ? &m_frameNr[cu.m_encData->m_frameEncoderID] :
> NULL;
> > -    int qpy = cu.m_qp[0];
> > +    int qpy = qp ? qp : cu.m_qp[0];
> >      m_qpParam[TEXT_LUMA].setQpParam(qpy + QP_BD_OFFSET);
> >      setChromaQP(qpy + cu.m_slice->m_pps->chromaQpOffset[0],
> TEXT_CHROMA_U, cu.m_chromaFormat);
> >      setChromaQP(qpy + cu.m_slice->m_pps->chromaQpOffset[1],
> TEXT_CHROMA_V, cu.m_chromaFormat);
> > diff -r 6461985f33ac -r 615b61dd2be5 source/common/quant.h
> > --- a/source/common/quant.h   Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/common/quant.h   Mon Mar 16 16:36:51 2015 +0530
> > @@ -103,7 +103,7 @@
> >      bool allocNoiseReduction(const x265_param& param);
> >
> >      /* CU setup */
> > -    void setQPforQuant(const CUData& cu);
> > +    void setQPforQuant(const CUData& cu, const int qp = 0);
>
> I strongly dislike default values in C++ code. Also 'const int' is
> redundant for integer arguments.
>

OK, instead of setting a default value I will pass this as an explicit
argument in each function call.


> >      uint32_t transformNxN(const CUData& cu, const pixel* fenc, uint32_t
> fencStride, const int16_t* residual, uint32_t resiStride, coeff_t* coeff,
> >                            uint32_t log2TrSize, TextType ttype, uint32_t
> absPartIdx, bool useTransformSkip);
> > diff -r 6461985f33ac -r 615b61dd2be5 source/encoder/analysis.cpp
> > --- a/source/encoder/analysis.cpp     Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/encoder/analysis.cpp     Mon Mar 16 16:36:51 2015 +0530
> > @@ -225,6 +225,10 @@
> >      bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);
> >      bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
> >
> > +    int32_t qp = 0;
> > +    if (depth <= (uint32_t)m_param->rc.maxCuDQPDepth)
> > +        qp = calculateQpforCuSize(parentCTU, cuGeom);
> > +
> >      if (m_param->analysisMode == X265_ANALYSIS_LOAD)
> >      {
> >          uint8_t* reuseDepth  =
> &m_reuseIntraDataCTU->depth[parentCTU.m_cuAddr * parentCTU.m_numPartitions];
> > @@ -234,11 +238,11 @@
> >
> >          if (mightNotSplit && depth == reuseDepth[zOrder] && zOrder ==
> cuGeom.absPartIdx)
> >          {
> > -            m_quant.setQPforQuant(parentCTU);
> > +            m_quant.setQPforQuant(parentCTU, qp);
> >
> >              PartSize size = (PartSize)reusePartSizes[zOrder];
> >              Mode& mode = size == SIZE_2Nx2N ? md.pred[PRED_INTRA] :
> md.pred[PRED_INTRA_NxN];
> > -            mode.cu.initSubCU(parentCTU, cuGeom);
> > +            mode.cu.initSubCU(parentCTU, cuGeom, qp);
> >              checkIntra(mode, cuGeom, size, &reuseModes[zOrder],
> &reuseChromaModes[zOrder]);
> >              checkBestMode(mode, depth);
> >
> > @@ -255,15 +259,15 @@
> >      }
> >      else if (mightNotSplit)
> >      {
> > -        m_quant.setQPforQuant(parentCTU);
> > +        m_quant.setQPforQuant(parentCTU, qp);
>
> this seems wrong in general; shouldn't quant always use the QP set in
> the CU data structure? it seems like we should be ensuring
> m_quant.setQPforQuant() is always called after configuring the QP for
> the CU.
>

We have a different QP for each CU size, depending on --max-dqp-depth.
In this case, whenever the QP changes for a CU size, the same QP also
needs to be configured for Quant.
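
One way to keep the CU and Quant in step is to store the selected QP in
the CU first and then configure the quantizer from the CU. The sketch
below uses simplified stand-in types (CuStub, QuantStub) and hypothetical
helpers (demoQp, selectCuQp, configureCu); it is not the x265 code, and
demoQp only stands in for calculateQpforCuSize().

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the encoder types (hypothetical).
struct CuStub    { int qp; };            // QP stored in the CU data
struct QuantStub { int lumaQp = -1; };   // QP the quantizer will use

// Hypothetical per-CU QP source, standing in for calculateQpforCuSize().
using QpFn = int (*)(uint32_t depth);

// Toy QP function used only for illustration.
inline int demoQp(uint32_t depth)
{
    return 26 + static_cast<int>(depth);
}

// Pick the QP for a CU at 'depth': compute a fresh one while the depth is
// within maxDqpDepth, otherwise inherit the CTU-level QP.
inline int selectCuQp(uint32_t depth, int maxDqpDepth, int ctuQp, QpFn calcQp)
{
    return depth <= static_cast<uint32_t>(maxDqpDepth) ? calcQp(depth) : ctuQp;
}

// Store the QP in the CU first, then configure the quantizer from the CU,
// so the two cannot disagree.
inline void configureCu(CuStub& cu, QuantStub& quant, int qp)
{
    cu.qp = qp;
    quant.lumaQp = cu.qp;
}
```

Passing the QP explicitly also avoids the ambiguity in the posted
"qp ? qp : cu.m_qp[0]" expression, where a computed QP of 0 is
indistinguishable from "no QP supplied".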


>
> > -        md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);
> > +        md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
> >          checkIntra(md.pred[PRED_INTRA], cuGeom, SIZE_2Nx2N, NULL, NULL);
> >          checkBestMode(md.pred[PRED_INTRA], depth);
> >
> >          if (cuGeom.log2CUSize == 3 &&
> m_slice->m_sps->quadtreeTULog2MinSize < 3)
> >          {
> > -            md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom);
> > +            md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom, qp);
> >              checkIntra(md.pred[PRED_INTRA_NxN], cuGeom, SIZE_NxN, NULL,
> NULL);
> >              checkBestMode(md.pred[PRED_INTRA_NxN], depth);
> >          }
> > @@ -280,7 +284,7 @@
> >          Mode* splitPred = &md.pred[PRED_SPLIT];
> >          splitPred->initCosts();
> >          CUData* splitCU = &splitPred->cu;
> > -        splitCU->initSubCU(parentCTU, cuGeom);
> > +        splitCU->initSubCU(parentCTU, cuGeom, qp);
> >
> >          uint32_t nextDepth = depth + 1;
> >          ModeDepth& nd = m_modeDepth[nextDepth];
> > @@ -496,6 +500,10 @@
> >
> >      X265_CHECK(m_param->rdLevel >= 2, "compressInterCU_dist does not
> support RD 0 or 1\n");
> >
> > +    int32_t qp = 0;
> > +    if (depth <= (uint32_t)m_param->rc.maxCuDQPDepth)
> > +        qp = calculateQpforCuSize(parentCTU, cuGeom);
> > +
> >      if (mightNotSplit && depth >= minDepth)
> >      {
> >          int bTryAmp = m_slice->m_sps->maxAMPDepth > depth &&
> (cuGeom.log2CUSize < 6 || m_param->rdLevel > 4);
> > @@ -504,28 +512,28 @@
> >          PMODE pmode(*this, cuGeom);
> >
> >          /* Initialize all prediction CUs based on parentCTU */
> > -        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);
> > -        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);
> > +        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
> > +        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
> >          if (bTryIntra)
> >          {
> > -            md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);
> > +            md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
> >              if (cuGeom.log2CUSize == 3 &&
> m_slice->m_sps->quadtreeTULog2MinSize < 3 && m_param->rdLevel >= 5)
> > -                md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >              pmode.modes[pmode.m_jobTotal++] = PRED_INTRA;
> >          }
> > -        md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom);
> pmode.modes[pmode.m_jobTotal++] = PRED_2Nx2N;
> > -        md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom);
> > +        md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> pmode.modes[pmode.m_jobTotal++] = PRED_2Nx2N;
> > +        md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);
> >          if (m_param->bEnableRectInter)
> >          {
> > -            md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom);
> pmode.modes[pmode.m_jobTotal++] = PRED_2NxN;
> > -            md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom);
> pmode.modes[pmode.m_jobTotal++] = PRED_Nx2N;
> > +            md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
> pmode.modes[pmode.m_jobTotal++] = PRED_2NxN;
> > +            md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> pmode.modes[pmode.m_jobTotal++] = PRED_Nx2N;
> >          }
> >          if (bTryAmp)
> >          {
> > -            md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom);
> pmode.modes[pmode.m_jobTotal++] = PRED_2NxnU;
> > -            md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom);
> pmode.modes[pmode.m_jobTotal++] = PRED_2NxnD;
> > -            md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom);
> pmode.modes[pmode.m_jobTotal++] = PRED_nLx2N;
> > -            md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom);
> pmode.modes[pmode.m_jobTotal++] = PRED_nRx2N;
> > +            md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom, qp);
> pmode.modes[pmode.m_jobTotal++] = PRED_2NxnU;
> > +            md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom, qp);
> pmode.modes[pmode.m_jobTotal++] = PRED_2NxnD;
> > +            md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> pmode.modes[pmode.m_jobTotal++] = PRED_nLx2N;
> > +            md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> pmode.modes[pmode.m_jobTotal++] = PRED_nRx2N;
> >          }
> >
> >          pmode.tryBondPeers(*m_frame->m_encData->m_jobProvider,
> pmode.m_jobTotal);
> > @@ -654,7 +662,7 @@
> >
> >          if (md.bestMode->rdCost == MAX_INT64 && !bTryIntra)
> >          {
> > -            md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);
> > +            md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
> >              checkIntraInInter(md.pred[PRED_INTRA], cuGeom);
> >              encodeIntraInInter(md.pred[PRED_INTRA], cuGeom);
> >              checkBestMode(md.pred[PRED_INTRA], depth);
> > @@ -680,7 +688,7 @@
> >          Mode* splitPred = &md.pred[PRED_SPLIT];
> >          splitPred->initCosts();
> >          CUData* splitCU = &splitPred->cu;
> > -        splitCU->initSubCU(parentCTU, cuGeom);
> > +        splitCU->initSubCU(parentCTU, cuGeom, qp);
> >
> >          uint32_t nextDepth = depth + 1;
> >          ModeDepth& nd = m_modeDepth[nextDepth];
> > @@ -744,13 +752,17 @@
> >      bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
> >      uint32_t minDepth = topSkipMinDepth(parentCTU, cuGeom);
> >
> > +    int32_t qp = 0;
> > +    if (depth <= (uint32_t)m_param->rc.maxCuDQPDepth)
> > +        qp = calculateQpforCuSize(parentCTU, cuGeom);
> > +
> >      if (mightNotSplit && depth >= minDepth)
> >      {
> >          bool bTryIntra = m_slice->m_sliceType != B_SLICE ||
> m_param->bIntraInBFrames;
> >
> >          /* Compute Merge Cost */
> > -        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);
> > -        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);
> > +        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
> > +        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
> >          checkMerge2Nx2N_rd0_4(md.pred[PRED_SKIP], md.pred[PRED_MERGE],
> cuGeom);
> >
> >          bool earlyskip = false;
> > @@ -759,24 +771,24 @@
> >
> >          if (!earlyskip)
> >          {
> > -            md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom);
> > +            md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> >              checkInter_rd0_4(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N);
> >
> >              if (m_slice->m_sliceType == B_SLICE)
> >              {
> > -                md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);
> >                  checkBidir2Nx2N(md.pred[PRED_2Nx2N],
> md.pred[PRED_BIDIR], cuGeom);
> >              }
> >
> >              Mode *bestInter = &md.pred[PRED_2Nx2N];
> >              if (m_param->bEnableRectInter)
> >              {
> > -                md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> >                  checkInter_rd0_4(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N);
> >                  if (md.pred[PRED_Nx2N].sa8dCost < bestInter->sa8dCost)
> >                      bestInter = &md.pred[PRED_Nx2N];
> >
> > -                md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
> >                  checkInter_rd0_4(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN);
> >                  if (md.pred[PRED_2NxN].sa8dCost < bestInter->sa8dCost)
> >                      bestInter = &md.pred[PRED_2NxN];
> > @@ -798,24 +810,24 @@
> >
> >                  if (bHor)
> >                  {
> > -                    md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd0_4(md.pred[PRED_2NxnU], cuGeom,
> SIZE_2NxnU);
> >                      if (md.pred[PRED_2NxnU].sa8dCost <
> bestInter->sa8dCost)
> >                          bestInter = &md.pred[PRED_2NxnU];
> >
> > -                    md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd0_4(md.pred[PRED_2NxnD], cuGeom,
> SIZE_2NxnD);
> >                      if (md.pred[PRED_2NxnD].sa8dCost <
> bestInter->sa8dCost)
> >                          bestInter = &md.pred[PRED_2NxnD];
> >                  }
> >                  if (bVer)
> >                  {
> > -                    md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd0_4(md.pred[PRED_nLx2N], cuGeom,
> SIZE_nLx2N);
> >                      if (md.pred[PRED_nLx2N].sa8dCost <
> bestInter->sa8dCost)
> >                          bestInter = &md.pred[PRED_nLx2N];
> >
> > -                    md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd0_4(md.pred[PRED_nRx2N], cuGeom,
> SIZE_nRx2N);
> >                      if (md.pred[PRED_nRx2N].sa8dCost <
> bestInter->sa8dCost)
> >                          bestInter = &md.pred[PRED_nRx2N];
> > @@ -847,7 +859,7 @@
> >                  if ((bTryIntra && md.bestMode->cu.getQtRootCbf(0)) ||
> >                      md.bestMode->sa8dCost == MAX_INT64)
> >                  {
> > -                    md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkIntraInInter(md.pred[PRED_INTRA], cuGeom);
> >                      encodeIntraInInter(md.pred[PRED_INTRA], cuGeom);
> >                      checkBestMode(md.pred[PRED_INTRA], depth);
> > @@ -865,7 +877,7 @@
> >
> >                  if (bTryIntra || md.bestMode->sa8dCost == MAX_INT64)
> >                  {
> > -                    md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkIntraInInter(md.pred[PRED_INTRA], cuGeom);
> >                      if (md.pred[PRED_INTRA].sa8dCost <
> md.bestMode->sa8dCost)
> >                          md.bestMode = &md.pred[PRED_INTRA];
> > @@ -893,7 +905,7 @@
> >                      {
> >                          /* generate recon pixels with no rate
> distortion considerations */
> >                          CUData& cu = md.bestMode->cu;
> > -                        m_quant.setQPforQuant(cu);
> > +                        m_quant.setQPforQuant(cu, qp);
> >
> >                          uint32_t tuDepthRange[2];
> >                          cu.getInterTUQtDepthRange(tuDepthRange, 0);
> > @@ -918,7 +930,7 @@
> >                      {
> >                          /* generate recon pixels with no rate
> distortion considerations */
> >                          CUData& cu = md.bestMode->cu;
> > -                        m_quant.setQPforQuant(cu);
> > +                        m_quant.setQPforQuant(cu, qp);
> >
> >                          uint32_t tuDepthRange[2];
> >                          cu.getIntraTUQtDepthRange(tuDepthRange, 0);
> > @@ -952,7 +964,7 @@
> >          Mode* splitPred = &md.pred[PRED_SPLIT];
> >          splitPred->initCosts();
> >          CUData* splitCU = &splitPred->cu;
> > -        splitCU->initSubCU(parentCTU, cuGeom);
> > +        splitCU->initSubCU(parentCTU, cuGeom, qp);
> >
> >          uint32_t nextDepth = depth + 1;
> >          ModeDepth& nd = m_modeDepth[nextDepth];
> > @@ -1025,14 +1037,18 @@
> >      bool mightSplit = !(cuGeom.flags & CUGeom::LEAF);
> >      bool mightNotSplit = !(cuGeom.flags & CUGeom::SPLIT_MANDATORY);
> >
> > +    int32_t qp = 0;
> > +    if (depth <= (uint32_t)m_param->rc.maxCuDQPDepth)
> > +        qp = calculateQpforCuSize(parentCTU, cuGeom);
> > +
> >      if (m_param->analysisMode == X265_ANALYSIS_LOAD)
> >      {
> >          uint8_t* reuseDepth  =
> &m_reuseInterDataCTU->depth[parentCTU.m_cuAddr * parentCTU.m_numPartitions];
> >          uint8_t* reuseModes  =
> &m_reuseInterDataCTU->modes[parentCTU.m_cuAddr * parentCTU.m_numPartitions];
> >          if (mightNotSplit && depth == reuseDepth[zOrder] && zOrder ==
> cuGeom.absPartIdx && reuseModes[zOrder] == MODE_SKIP)
> >          {
> > -            md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);
> > -            md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);
> > +            md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
> > +            md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
> >              checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP],
> md.pred[PRED_MERGE], cuGeom, true);
> >
> >              if (m_bTryLossless)
> > @@ -1051,20 +1067,20 @@
> >
> >      if (mightNotSplit)
> >      {
> > -        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom);
> > -        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom);
> > +        md.pred[PRED_SKIP].cu.initSubCU(parentCTU, cuGeom, qp);
> > +        md.pred[PRED_MERGE].cu.initSubCU(parentCTU, cuGeom, qp);
> >          checkMerge2Nx2N_rd5_6(md.pred[PRED_SKIP], md.pred[PRED_MERGE],
> cuGeom, false);
> >          bool earlySkip = m_param->bEnableEarlySkip && md.bestMode &&
> !md.bestMode->cu.getQtRootCbf(0);
> >
> >          if (!earlySkip)
> >          {
> > -            md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom);
> > +            md.pred[PRED_2Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> >              checkInter_rd5_6(md.pred[PRED_2Nx2N], cuGeom, SIZE_2Nx2N,
> false);
> >              checkBestMode(md.pred[PRED_2Nx2N], cuGeom.depth);
> >
> >              if (m_slice->m_sliceType == B_SLICE)
> >              {
> > -                md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_BIDIR].cu.initSubCU(parentCTU, cuGeom, qp);
> >                  checkBidir2Nx2N(md.pred[PRED_2Nx2N],
> md.pred[PRED_BIDIR], cuGeom);
> >                  if (md.pred[PRED_BIDIR].sa8dCost < MAX_INT64)
> >                  {
> > @@ -1075,11 +1091,11 @@
> >
> >              if (m_param->bEnableRectInter)
> >              {
> > -                md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_Nx2N].cu.initSubCU(parentCTU, cuGeom, qp);
> >                  checkInter_rd5_6(md.pred[PRED_Nx2N], cuGeom, SIZE_Nx2N,
> false);
> >                  checkBestMode(md.pred[PRED_Nx2N], cuGeom.depth);
> >
> > -                md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_2NxN].cu.initSubCU(parentCTU, cuGeom, qp);
> >                  checkInter_rd5_6(md.pred[PRED_2NxN], cuGeom, SIZE_2NxN,
> false);
> >                  checkBestMode(md.pred[PRED_2NxN], cuGeom.depth);
> >              }
> > @@ -1102,21 +1118,21 @@
> >
> >                  if (bHor)
> >                  {
> > -                    md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_2NxnU].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd5_6(md.pred[PRED_2NxnU], cuGeom,
> SIZE_2NxnU, bMergeOnly);
> >                      checkBestMode(md.pred[PRED_2NxnU], cuGeom.depth);
> >
> > -                    md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_2NxnD].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd5_6(md.pred[PRED_2NxnD], cuGeom,
> SIZE_2NxnD, bMergeOnly);
> >                      checkBestMode(md.pred[PRED_2NxnD], cuGeom.depth);
> >                  }
> >                  if (bVer)
> >                  {
> > -                    md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_nLx2N].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd5_6(md.pred[PRED_nLx2N], cuGeom,
> SIZE_nLx2N, bMergeOnly);
> >                      checkBestMode(md.pred[PRED_nLx2N], cuGeom.depth);
> >
> > -                    md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom);
> > +                    md.pred[PRED_nRx2N].cu.initSubCU(parentCTU, cuGeom,
> qp);
> >                      checkInter_rd5_6(md.pred[PRED_nRx2N], cuGeom,
> SIZE_nRx2N, bMergeOnly);
> >                      checkBestMode(md.pred[PRED_nRx2N], cuGeom.depth);
> >                  }
> > @@ -1124,13 +1140,13 @@
> >
> >              if (m_slice->m_sliceType != B_SLICE ||
> m_param->bIntraInBFrames)
> >              {
> > -                md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom);
> > +                md.pred[PRED_INTRA].cu.initSubCU(parentCTU, cuGeom, qp);
> >                  checkIntra(md.pred[PRED_INTRA], cuGeom, SIZE_2Nx2N,
> NULL, NULL);
> >                  checkBestMode(md.pred[PRED_INTRA], depth);
> >
> >                  if (cuGeom.log2CUSize == 3 &&
> m_slice->m_sps->quadtreeTULog2MinSize < 3)
> >                  {
> > -                    md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU,
> cuGeom);
> > +                    md.pred[PRED_INTRA_NxN].cu.initSubCU(parentCTU,
> cuGeom, qp);
> >                      checkIntra(md.pred[PRED_INTRA_NxN], cuGeom,
> SIZE_NxN, NULL, NULL);
> >                      checkBestMode(md.pred[PRED_INTRA_NxN], depth);
> >                  }
> > @@ -1150,7 +1166,7 @@
> >          Mode* splitPred = &md.pred[PRED_SPLIT];
> >          splitPred->initCosts();
> >          CUData* splitCU = &splitPred->cu;
> > -        splitCU->initSubCU(parentCTU, cuGeom);
> > +        splitCU->initSubCU(parentCTU, cuGeom, qp);
> >
> >          uint32_t nextDepth = depth + 1;
> >          ModeDepth& nd = m_modeDepth[nextDepth];
> > @@ -1896,7 +1912,7 @@
> >      return false;
> >  }
> >
> > -int Analysis::calculateQpforCuSize(CUData& ctu, const CUGeom& cuGeom)
> > +int Analysis::calculateQpforCuSize(const CUData& ctu, const CUGeom&
> cuGeom)
> >  {
> >      uint32_t ctuAddr = ctu.m_cuAddr;
> >      FrameData& curEncData = *m_frame->m_encData;
> > diff -r 6461985f33ac -r 615b61dd2be5 source/encoder/analysis.h
> > --- a/source/encoder/analysis.h       Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/encoder/analysis.h       Mon Mar 16 16:36:51 2015 +0530
> > @@ -139,7 +139,7 @@
> >      /* generate residual and recon pixels for an entire CTU recursively
> (RD0) */
> >      void encodeResidue(const CUData& parentCTU, const CUGeom& cuGeom);
> >
> > -    int calculateQpforCuSize(CUData& ctu, const CUGeom& cuGeom);
> > +    int calculateQpforCuSize(const CUData& ctu, const CUGeom& cuGeom);
> >
> >      /* check whether current mode is the new best */
> >      inline void checkBestMode(Mode& mode, uint32_t depth)
> > diff -r 6461985f33ac -r 615b61dd2be5 source/encoder/encoder.cpp
> > --- a/source/encoder/encoder.cpp      Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/encoder/encoder.cpp      Mon Mar 16 16:36:51 2015 +0530
> > @@ -1551,15 +1551,11 @@
> >      bool bIsVbv = m_param->rc.vbvBufferSize > 0 &&
> m_param->rc.vbvMaxBitrate > 0;
> >
> >      if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv))
> > -    {
> >          pps->bUseDQP = true;
> > -        pps->maxCuDQPDepth = 0; /* TODO: make configurable? */
> > -    }
> >      else
> > -    {
> >          pps->bUseDQP = false;
> > -        pps->maxCuDQPDepth = 0;
> > -    }
> > +
> > +    pps->maxCuDQPDepth = m_param->rc.maxCuDQPDepth;
> >
> >      pps->chromaQpOffset[0] = m_param->cbQpOffset;
> >      pps->chromaQpOffset[1] = m_param->crQpOffset;
> > @@ -1778,6 +1774,17 @@
> >          p->analysisMode = X265_ANALYSIS_OFF;
> >          x265_log(p, X265_LOG_WARNING, "Analysis save and load mode not
> supported for distributed mode analysis\n");
> >      }
> > +    bool bIsVbv = m_param->rc.vbvBufferSize > 0 &&
> m_param->rc.vbvMaxBitrate > 0;
> > +    if (!m_param->bLossless && (m_param->rc.aqMode || bIsVbv))
> > +    {
> > +        if (p->rc.maxCuDQPDepth > (NUM_CU_DEPTH - 2))
> > +        {
> > +            p->rc.maxCuDQPDepth = 0;
> > +            x265_log(p, X265_LOG_WARNING, "The maxCUDQPDepth should be
> less than maxCUDepth - 1(0, 1 or 2) setting maxCUDQPDepth = %d \n", 0);
> > +        }
> > +    }
> > +    else
> > +        p->rc.maxCuDQPDepth = 0;
>
> there should be some kind of a warning here explaining why the option
> the user asked for has been ignored
>

OK.
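
A rough sketch of that warning path, using a reduced stand-in struct
(RcStub) and a hypothetical helper (resetUnusableDqpDepth); the message
text is only a placeholder. In the encoder the returned text would be
passed to x265_log() at X265_LOG_WARNING level.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced view of the settings involved (not x265_param).
struct RcStub
{
    bool lossless;
    int  aqMode;
    bool vbvEnabled;
    int  maxCuDQPDepth;
};

// Resets maxCuDQPDepth when delta QP cannot be used and returns the text
// of a warning to log. Returns an empty string when nothing was changed.
inline std::string resetUnusableDqpDepth(RcStub& rc)
{
    const bool dqpUsable = !rc.lossless && (rc.aqMode || rc.vbvEnabled);
    if (dqpUsable || rc.maxCuDQPDepth == 0)
        return std::string();

    rc.maxCuDQPDepth = 0;
    return "max-dqp-depth ignored: it requires AQ or VBV and is not "
           "available in lossless mode";
}
```

The warning is emitted only when the user actually asked for a non-zero
depth, so default runs stay quiet.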


>
> >  }
> >
> >  void Encoder::allocAnalysis(x265_analysis_data* analysis)
> > diff -r 6461985f33ac -r 615b61dd2be5 source/x265.h
> > --- a/source/x265.h   Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/x265.h   Mon Mar 16 16:36:51 2015 +0530
> > @@ -977,6 +977,13 @@
> >          /* Enable stricter conditions to check bitrate deviations in
> CBR mode. May compromise
> >           * quality to maintain bitrate adherence */
> >          int bStrictCbr;
> > +
> > +        /* Max depth of a minimum CuDQP for sub-LCU-level delta QP
> > +         * the default maxCuDQPDepth is 0 then the CuDQP signaled once
> per CTU, this param
> > +         * enable the CuDQP signaled for sub-LCU-level also, minimum
> maxCuDQPDepth is 0
> > +         * and max maxCuDQPDepth is equal to maxCUDepth, always the
> CuDQP signaled
> > +         * if currentDepth is less than or equal to maxCuDQPDepth */
> > +        int maxCuDQPDepth;
>
> This is mixing LCU and CTU in the same paragraph. we've systematically
> removed all references to LCU in our public headers and docs and are
> using CTU everywhere. That's ignoring the fact that I can't make any
> sense of this description, as it's currently written.
>

OK, I will rewrite the description and update the rest of the docs as well.


>
> >      } rc;
> >
> >      /*== Video Usability Information ==*/
> > diff -r 6461985f33ac -r 615b61dd2be5 source/x265cli.h
> > --- a/source/x265cli.h        Sun Mar 15 11:58:32 2015 -0500
> > +++ b/source/x265cli.h        Mon Mar 16 16:36:51 2015 +0530
> > @@ -202,6 +202,7 @@
> >      { "strict-cbr",           no_argument, NULL, 0 },
> >      { "temporal-layers",      no_argument, NULL, 0 },
> >      { "no-temporal-layers",   no_argument, NULL, 0 },
> > +    { "max-dqp-depth",  required_argument, NULL, 0 },
> >      { 0, 0, 0, 0 },
> >      { 0, 0, 0, 0 },
> >      { 0, 0, 0, 0 },
> > _______________________________________________
> > x265-devel mailing list
> > [email protected]
> > https://mailman.videolan.org/listinfo/x265-devel
>
> --
> Steve Borho
> _______________________________________________
> x265-devel mailing list
> [email protected]
> https://mailman.videolan.org/listinfo/x265-devel
>



-- 
Thanks & Regards
Gopu G
Multicoreware Inc

