[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16433729#comment-16433729
 ] 

Weiwei Yang edited comment on YARN-7494 at 4/11/18 10:58 AM:
-------------------------------------------------------------

Sure, thanks for the review [~leftnoteasy].

Here is my review comments over v6 patch:

*CapacithScheduler*
 * line 458: unnecessary change

*CapacitySchedulerConfiguration*
 * line 2148: this can be replaced by #getTrimmedStrings
 * line 2157: need to add a check when the given policy name is not associated 
with a impl class, such config issue should fail the scheduler

*ClusterNodeTracker*
 * line 61: {{nodesPerLabel}} was initiated as an empty map and only be updated 
when there is a node update event, if there is no node label configured for the 
cluster, this will be just empty right? In another word, please make sure 
{{getNodesPerPartition}} returns all nodes if there is no partition configured.

*FiCaSchedulerApp*
 * I have commented this before, line 177/180, you can't call getCSLeafQueue 
because it assumes it is CS scheduler, I guess some of UT failures was caused 
by this. Can you fix it? I believe I also suggested how to fix it. But looks 
like you missed that.

*LeafQueue*
 * line 289: this will only look up the leaf queue config for a policy. This 
needs to be done in AbstractCSQueue in order to support inheriting polices from 
parent queues. I don't mind to address this in another Jira, just make sure we 
don't loss tracking this. A must do to me.

*DefaultMultiNodeLookupPolicy*
 * line 42: {{nodesPerPartition}} is not initiated
 * line 57: why this line is necessary? As we are doing back-end sorting, while 
retrieving the sorted list, it should be directly retrieving from the cache, 
not processing a sorting again.
 * line 57: {{#getPreferredNodeIterator}} this API should not depend on 
{{Collection<N> nodes}}, can you please double check.

*MultiNodeSorter*
 * line 81: I don't think we should give a default when met an invalid config, 
that would cause confusion to users. I prefer the fail-fast way and let the 
init fail.

*ResourceUsageBasedMultiNodeLookupPolicy*
 * what's the difference of this class to DefaultMultiNodeLookupPolicy

Thanks


was (Author: cheersyang):
Sure, thanks for the review [~leftnoteasy].

Here is my review comments over v6 patch:

*CapacithScheduler*
 * line 458: unnecessary change

*CapacitySchedulerConfiguration*
 * line 2148: this can be replaced by \{{#getTrimmedStrings}}
 * line 2157: need to add a check when the given policy name is not associated 
with a impl class, such config issue should fail the scheduler

*ClusterNodeTracker*
 * line 61: \{{nodesPerLabel}} was initiated as an empty map and only be 
updated when there is a node update event, if there is no node label configured 
for the cluster, this will be just empty right? In another word, please make 
sure \{{getNodesPerPartition}} returns all nodes if there is no partition 
configured.

*FiCaSchedulerApp*
 * I have commented this before, line 177/180, you can't call getCSLeafQueue 
because it assumes it is CS scheduler, I guess some of UT failures was caused 
by this. Can you fix it? I believe I also suggested how to fix it. But looks 
like you missed that.

*LeafQueue*
 * line 289: this will only look up the leaf queue config for a policy. This 
needs to be done in AbstractCSQueue in order to support inheriting polices from 
parent queues. I don't mind to address this in another Jira, just make sure we 
don't loss tracking this. A must do to me.

*DefaultMultiNodeLookupPolicy*
 * line 42: {{nodesPerPartition}} is not initiated
 * line 57: why this line is necessary? As we are doing back-end sorting, while 
retrieving the sorted list, it should be directly retrieving from the cache, 
not processing a sorting again.
 * line 57: \{{#getPreferredNodeIterator}} this API should not depend on 
\{{Collection<N> nodes}}, can you please double check.

*MultiNodeSorter*
 * line 81: I don't think we should give a default when met an invalid config, 
that would cause confusion to users. I prefer the fail-fast way and let the 
init fail.

*ResourceUsageBasedMultiNodeLookupPolicy*
 * what's the difference of this class to \{{DefaultMultiNodeLookupPolicy}}

Thanks

 

 

> Add muti node lookup support for better placement
> -------------------------------------------------
>
>                 Key: YARN-7494
>                 URL: https://issues.apache.org/jira/browse/YARN-7494
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>            Priority: Major
>         Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, 
> YARN-7494.006.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, 
> multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to