[ 
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16407831#comment-16407831
 ] 

Weiwei Yang commented on YARN-7494:
-----------------------------------

Hi [~sunilg]

Thanks for the updates,
{quote}each scheduler can register policies to monitor by common 
MultiNodeManager.
{quote}
I like this. MultiNodeManager becomes an RM service, and each scheduler 
registers the policies it requires with this service. It's less coupled and 
should be reusable in other schedulers.

I have read your patch carefully; please see my comments below:

*The interfaces*
 Currently it looks like:
 # {{MultiNodeSortingManager}} maintains the following mapping to manage 
multiple sorters:
 PolicyName -> MultiNodeSorter
 # {{MultiNodeSorter}} is bound to a specific {{MultiNodeLookupPolicy}}, and 
it runs a sorting thread that re-sorts nodes by this policy at a certain 
interval.
 # {{MultiNodeLookupPolicy}} holds a certain {{NodesSortingAlgorithm}}, which 
defines the actual sorting algorithm. Users can plug in their own policy by 
implementing the algorithm interface.

*#1* and *#2* are OK, but *#3* looks redundant to me, as the public APIs of 
{{MultiNodeLookupPolicy}} and {{NodesSortingAlgorithm}} are too similar. I 
think we can remove {{NodesSortingAlgorithm}}.
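If we merge the two, the policy interface could own the sorting directly. A 
minimal sketch of what I mean (all names here are illustrative, not the 
actual patch API):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Stand-in for SchedulerNode; only what the sketch needs.
interface SchedulerNodeLike {
  int getAvailableMemoryMB();
}

// With NodesSortingAlgorithm removed, the policy itself defines the ordering
// and performs the sort.
interface MultiNodeLookupPolicy<N extends SchedulerNodeLike> {
  Comparator<N> getComparator();

  default List<N> sortNodes(Collection<N> nodes) {
    List<N> sorted = new ArrayList<>(nodes);
    sorted.sort(getComparator());
    return sorted;
  }
}

// Example policy: nodes with the most available memory first.
class AvailableMemoryPolicy implements MultiNodeLookupPolicy<SchedulerNodeLike> {
  @Override
  public Comparator<SchedulerNodeLike> getComparator() {
    return Comparator.comparingInt(SchedulerNodeLike::getAvailableMemoryMB)
        .reversed();
  }
}
```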

*CapacityScheduler*
 # line 254: {{multiNodePlacementEnabled}} does not seem to be initialized or 
used anywhere in this class.
 # line 1803, 1809: unnecessary change
 # line 1819: {{refreshLabelToNodeCache}}: it looks like you want to maintain 
a label-to-nodes mapping in the nodeTracker and refresh it in the scheduler? 
I am wondering why this is necessary; we should be able to get nodes by 
partition directly from the node tracker.

*CapacitySchedulerConfiguration*
 # line 2044: {{getMultiNodesSortingAlgorithm}}: in this method, we need to 
validate that the policy name is defined in 
{{yarn.scheduler.capacity.multi-node-sorting.policies}}
 misc: some other unnecessary changes
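The validation I have in mind could look roughly like this (method and field 
names are my assumptions, only the property key comes from the config):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of validating a policy name against the configured
// policy list; not the patch's actual code.
class MultiNodePolicyConfig {
  static final String POLICIES_KEY =
      "yarn.scheduler.capacity.multi-node-sorting.policies";

  private final Set<String> configuredPolicies;

  // policiesValue is the comma-separated value of POLICIES_KEY.
  MultiNodePolicyConfig(String policiesValue) {
    configuredPolicies = new HashSet<>(
        Arrays.asList(policiesValue.trim().split("\\s*,\\s*")));
  }

  // Returns the policy name only if it was declared under POLICIES_KEY.
  String getMultiNodesSortingAlgorithm(String policyName) {
    if (!configuredPolicies.contains(policyName)) {
      throw new IllegalArgumentException("Policy '" + policyName
          + "' is not defined in " + POLICIES_KEY);
    }
    return policyName;
  }
}
```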

*ClusterNodeTracker*
 # Please see my comment for CapacityScheduler #3: I don't think we need a 
separate {{nodesPerLabel}} map to store the partition-to-nodes mapping. We 
could simply do

{code:java}
getNodes(node -> partition.equals(node.getPartition()));
{code}
*LocalityAppPlacementAllocator*
 # line 82, 103: can we get the sorted nodes directly from 
{{MultiNodeSortingManager}}? In that case the invocation can be simplified.
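To illustrate the simplified call path I am suggesting (all names here are 
hypothetical, not the patch's API): the allocator just asks the manager for a 
pre-sorted iterator instead of going through intermediate objects.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the sorting manager hands out the pre-sorted node
// list directly, so LocalityAppPlacementAllocator needs a single call.
class MultiNodeSortingManagerSketch {
  private final Map<String, List<String>> sortedByPolicy = new LinkedHashMap<>();

  // Called by the sorter thread after each sorting pass.
  void updateSortedNodes(String policyName, List<String> sortedNodeIds) {
    sortedByPolicy.put(policyName, sortedNodeIds);
  }

  // What the allocator would call at lines 82/103.
  Iterator<String> getMultiNodeSortIterator(String policyName) {
    return sortedByPolicy.getOrDefault(policyName, List.of()).iterator();
  }
}
```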

*MultiNodeSorter*
 # line 52: {{monitorInterval}} is not initialized. I think this value should 
be loaded from conf, so that each policy can be configured with its own 
sorting interval. Further, we could support on-demand sorting when the 
interval is 0.
 # line 104: this does not seem to handle the case where we have no labels in 
the cluster. In that case, the nodes to sort should be all nodes in the 
cluster, correct?
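For the per-policy interval, something along these lines could work (the 
config key naming and default are my assumptions):

```java
import java.util.Map;

// Sketch of loading a per-policy sorting interval from conf, with 0 meaning
// on-demand sorting instead of a background thread. Key name is hypothetical.
class SorterIntervalSketch {
  static final long DEFAULT_INTERVAL_MS = 1000L;

  static long resolveInterval(Map<String, String> conf, String policyName) {
    String key = "yarn.scheduler.capacity.multi-node-sorting.policy."
        + policyName + ".sorting-interval.ms";
    return Long.parseLong(
        conf.getOrDefault(key, Long.toString(DEFAULT_INTERVAL_MS)));
  }

  // interval == 0 => the sorter skips the thread and sorts on demand.
  static boolean isOnDemand(long intervalMs) {
    return intervalMs == 0L;
  }
}
```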

 

Please let me know if this makes sense.

Thanks

> Add muti node lookup support for better placement
> -------------------------------------------------
>
>                 Key: YARN-7494
>                 URL: https://issues.apache.org/jira/browse/YARN-7494
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>            Priority: Major
>         Attachments: YARN-7494.001.patch, YARN-7494.002.patch, 
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.v0.patch, 
> YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup 
> based on partition to start with.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
