[
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16407831#comment-16407831
]
Weiwei Yang commented on YARN-7494:
-----------------------------------
Hi [~sunilg]
Thanks for the updates,
{quote}each scheduler can register policies to monitor by common
MultiNodeManager.
{quote}
I like this. MultiNodeManager becomes an RM service, and each scheduler registers the policies it needs with that service. It is less coupled and should be reusable by other schedulers.
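Just to make sure I understand the registration flow, a minimal sketch of what I have in mind (the class shape and method names below are my assumption, not the patch's actual API):
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.service.AbstractService;

// Sketch only: a shared RM service keyed by policy name. The constructor and
// method signatures here are assumptions for illustration.
public class MultiNodeSortingManager extends AbstractService {
  private final Map<String, MultiNodeSorter> sorters = new ConcurrentHashMap<>();

  public MultiNodeSortingManager() {
    super(MultiNodeSortingManager.class.getName());
  }

  // Each scheduler registers the policies it needs; one sorter per policy.
  public void registerPolicy(String policyName, MultiNodeLookupPolicy policy) {
    sorters.computeIfAbsent(policyName, name -> new MultiNodeSorter(policy));
  }
}
{code}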
I have read your patch carefully; please see my comments below:
*The interfaces*
Currently it looks like:
# {{MultiNodeSortingManager}} maintains the following mapping to manage multiple sorters:
PolicyName -> MultiNodeSorter
# {{MultiNodeSorter}} is bound to a specific {{MultiNodeLookupPolicy}}, and it runs a sorting thread that sorts nodes by this policy at a certain interval.
# {{MultiNodeLookupPolicy}} holds a certain {{NodesSortingAlgorithm}}, which defines the actual sorting algorithm. Users can plug in their own policies by implementing the algorithm interface.
*#1* and *#2* are OK, but *#3* looks redundant to me, since the public APIs of {{MultiNodeLookupPolicy}} and {{NodesSortingAlgorithm}} are too similar. I think we can remove {{NodesSortingAlgorithm}} and let the policy own the node ordering directly, e.g. something like the sketch below.
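A rough illustration of a merged interface (method names and generics here are my assumptions, not the patch's exact signatures):
{code:java}
import java.util.Collection;
import java.util.Iterator;

import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode;

// Sketch of a merged interface: the policy owns the node ordering, so a
// separate NodesSortingAlgorithm is no longer needed.
public interface MultiNodeLookupPolicy<N extends SchedulerNode> {
  // Re-sorts the given nodes of a partition (what NodesSortingAlgorithm did).
  void addAndRefreshNodesSet(Collection<N> nodes, String partition);

  // Returns the nodes in the order the scheduler should try allocation.
  Iterator<N> getPreferredNodeIterator(Collection<N> nodes, String partition);
}
{code}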
*CapacityScheduler*
# line 254: {{multiNodePlacementEnabled}} does not seem to be initialized or used anywhere in this class.
# line 1803, 1809: unnecessary change
# line 1819: {{refreshLabelToNodeCache}}: it looks like you want to maintain a label-to-nodes mapping in nodeTracker and refresh it in the scheduler? I am wondering why this is necessary; we should be able to get nodes by partition directly from the node tracker.
*CapacitySchedulerConfiguration*
# line 2044: {{getMultiNodesSortingAlgorithm}}: in this method, we need to validate that the policy name is defined in {{yarn.scheduler.capacity.multi-node-sorting.policies}}, e.g. along the lines of the sketch below.
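Something like this could work (rough sketch only; I'm assuming the policy names are a comma-separated list under that property):
{code:java}
// Sketch only: validate the requested policy name against the configured list.
String[] configuredPolicies =
    conf.getStrings("yarn.scheduler.capacity.multi-node-sorting.policies");
if (configuredPolicies == null
    || !java.util.Arrays.asList(configuredPolicies).contains(policyName)) {
  throw new IllegalArgumentException("Policy '" + policyName
      + "' is not defined in multi-node-sorting.policies");
}
{code}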
misc: some other unnecessary changes
*ClusterNodeTracker*
# Please see my comment for CapacityScheduler #3: I don't think we need a separate {{nodesPerLabel}} map to store the partition-to-nodes mapping. We could simply do
{code:java}
getNodes(node -> partition.equals(node.getPartition()));
{code}
*LocalityAppPlacementAllocator*
# line 82, 103: can we directly get the sorted nodes from {{MultiNodeSortingManager}}? In that case the invocation could be simplified, roughly as sketched below.
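Something along these lines (sketch only; the accessor chain is my assumption about how the allocator could ask the manager for already-sorted nodes):
{code:java}
// Sketch only: accessor names are assumptions, not the patch's actual API.
MultiNodeSorter<FiCaSchedulerNode> sorter =
    rmContext.getMultiNodeSortingManager().getSorter(policyName);
Iterator<FiCaSchedulerNode> preferredNodes =
    sorter.getPreferredNodeIterator(partition);
{code}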
*MultiNodeSorter*
# line 52: {{monitorInterval}} is not initialized. I think this value should be loaded from conf, so that each policy can be configured with its own sorting interval. Further, we could support an on-demand sort when the interval is 0; see the sketch after this list.
# line 104: this does not seem to support the case where we don't have any labels in the cluster. In that case, the nodes to sort should be all nodes in the cluster, correct?
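For #1, something like this is what I mean (sketch only; the property key, default and helper names below are assumptions, not from the patch):
{code:java}
// Sketch only: per-policy interval loaded from conf, on-demand sort when 0.
String intervalKey = "yarn.scheduler.capacity.multi-node-sorting.policy."
    + policyName + ".sorting-interval.ms";
long monitorInterval = conf.getLong(intervalKey, 1000L);
if (monitorInterval > 0) {
  // Periodic background sort at the per-policy configured interval.
  startSortingThread(monitorInterval);
} else {
  // interval == 0: no background thread, sort on demand instead.
  onDemandSortEnabled = true;
}
{code}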
Please let me know if this makes sense.
Thanks
> Add muti node lookup support for better placement
> -------------------------------------------------
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler
> Reporter: Sunil G
> Assignee: Sunil G
> Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.002.patch,
> YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.v0.patch,
> YARN-7494.v1.patch, multi-node-designProposal.png
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup
> based on partition to start with.