[ 
https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088436#comment-15088436
 ] 

Sangjin Lee commented on YARN-4062:
-----------------------------------

Thanks [~vrushalic] for the work! She and I went over the patch offline. Here 
are some comments on the patch. Please look at the javac, findbugs, checkstyle, 
javadoc, whitespace issues too.

(TimelineStorageUtils.java)
- l.572: I'm not too clear about this; should we break out of the for loop once 
we find the app id?

(TimestampGenerator.java)
- l.36: should we make this a round million?
- we should review the javadoc, and update it accordingly

(FlowRunCoprocessor.java)
- l.46: KeyValyeScanner is an unused import
- l.58: LOG is now used, so the annotation should be removed
- l.272-273: nit: you can simply do
{code}
request.isMajor() ? FlowScannerOperation.MAJOR_COMPACTION :
    FlowScannerOperation.MINOR_COMPACTION);
{code}

(FlowScanner.java)
- l.67: can you add this parameter as a real configuration (in 
YarnConfiguration.java/yarn-default.xml)?
- l.79: scanType is not used anywhere in the code?
- l.93: the instanceof operator returns false if the left operand is null, so 
the null check is superfluous
- l.131: getAggregationCompactionDimension() should be removed
- l.433: we discussed separating the logic of processing summation into a 
separate class so that it is more amenable to unit tests; it is not a strong 
preference, so please use your discretion to decide whether that is better
- l.447: typo: "order" -> "older"
- l.471: shouldn't we use NumericValueConverter.add() to add the values to 
maintain the generic nature of the types?
- l.490: nit: since FLOW_APP_ID is a static, we shouldn't use the "this" 
qualifier
- l.493: the same nit as above
- l.576: FlowScanner still has createNewCell(); it should be removed, right?

(TestFlowDataGenerator.java)
- l.21: unused import
- l.32: incorrect import (should be apache commons logging log)

(TestHBaseStorageFlowRunCompaction.java)
- l.23-82: there are 16 unused imports
- l.95-97: these 3 variables are unused
- l.166: admin is unused
- l.232-233: see above
- l.255: observer context is unused
- l.332: typo: "FINAl" -> "FINAL"
- l.414: typo: "FINAl" -> "FINAL"

We discussed how the co-processor can disable the compaction behavior without 
restarting the region server; (1) there should be a configuration/flag that 
enables and disables the compaction behavior, and (2) the co-processor should 
be able to pick up the flag value without restarting the region server. We 
don't need to do it as part of this JIRA. We can do it in YARN-4561.

(2) We also discussed how to handle the replication scenarios. It might be a 
good idea to use some kind of a regex matching against the cluster name as a 
way to whitelist or blacklist clusters for the compaction behavior. We can 
probably handle it in YARN-4561 as well.

> Add the flush and compaction functionality via coprocessors and scanners for 
> flow run table
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-4062
>                 URL: https://issues.apache.org/jira/browse/YARN-4062
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: timelineserver
>            Reporter: Vrushali C
>            Assignee: Vrushali C
>              Labels: yarn-2928-1st-milestone
>         Attachments: YARN-4062-YARN-2928.1.patch, 
> YARN-4062-feature-YARN-2928.01.patch
>
>
> As part of YARN-3901, coprocessor and scanner is being added for storing into 
> the flow_run table. It also needs a flush & compaction processing in the 
> coprocessor and perhaps a new scanner to deal with the data during flushing 
> and compaction stages. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to