[
https://issues.apache.org/jira/browse/YARN-10738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17338910#comment-17338910
]
Bibin Chundatt commented on YARN-10738:
---------------------------------------
[~zhuqi]
These are the probable issues I see with using
ResourceUsageMultiNodeLookupPolicy on a large cluster, which could cause hot spots.
The sorting is based on available resources: memory first, then CPU, then
node ID.
# If memory is available on a node but its vcores are fully used, we still pick
that node for allocation attempts.
# If the cluster has nodes of different resource sizes, the hot spot problem
becomes more serious. The larger machines are always preferred, leaving the
lower-profile machines underutilized.
# If all the nodes are the same size and unused, the ordering is based on
node ID, so allocation attempts hit the machines in canonical order.
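
To make the three cases above concrete, here is a minimal sketch of a comparator that orders by available memory, then available vcores, then node ID. It is not the ResourceUsageMultiNodeLookupPolicy code: the NodeOrderSketch class, the Node type, its fields, and the "most available first" direction are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the node ordering discussed above; not a YARN class.
final class NodeOrderSketch {

    // Assumed minimal view of a node: ID plus currently available resources.
    static final class Node {
        final String id;
        final long availMemMb;
        final int availVcores;

        Node(String id, long availMemMb, int availVcores) {
            this.id = id;
            this.availMemMb = availMemMb;
            this.availVcores = availVcores;
        }
    }

    // Assumption: most available memory first, then most available vcores,
    // then node ID in ascending order as the final tie-breaker.
    static final Comparator<Node> BY_AVAILABLE =
        Comparator.comparingLong((Node n) -> n.availMemMb).reversed()
            .thenComparing(Comparator.comparingInt((Node n) -> n.availVcores).reversed())
            .thenComparing((Node n) -> n.id);

    // Returns a sorted copy and leaves the input list untouched.
    static List<Node> sortByAvailable(List<Node> nodes) {
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(BY_AVAILABLE);
        return sorted;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("node-c", 32768, 16)); // idle small node
        nodes.add(new Node("node-b", 32768, 16)); // idle small node, same size
        nodes.add(new Node("node-a", 65536, 0));  // large node, vcores exhausted
        for (Node n : sortByAvailable(nodes)) {
            System.out.println(n.id + " mem=" + n.availMemMb + " vcores=" + n.availVcores);
        }
    }
}
```

With this input, node-a sorts first because memory is compared before vcores, even though it has no free vcores (cases 1 and 2). The two identical idle nodes then fall back to node ID order (case 3), so every scheduling thread would try them in the same sequence.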
> When multi thread scheduling with multi node, we should shuffle with a gap to
> prevent hot accessing nodes.
> ----------------------------------------------------------------------------------------------------------
>
> Key: YARN-10738
> URL: https://issues.apache.org/jira/browse/YARN-10738
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Qi Zhu
> Assignee: Qi Zhu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Currently, multi-threaded scheduling with multi-node placement does not
> behave reasonably.
> In large clusters it causes hot-accessed nodes, which leads to abnormally
> overloaded nodes.
> Solution:
> I think we should shuffle the sorted nodes (for example, under the available
> resource sort policy) with an interval.
> It will solve the above problem and avoid hot-accessed nodes.
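
One possible reading of "shuffle with an interval" is to keep the sorted order at a coarse level but randomize nodes within fixed-size windows, so that concurrent scheduling threads do not all start at the same node. The sketch below illustrates that reading only. The GapShuffleSketch class, the shuffleWithGap method, and the gap parameter are hypothetical names and are not taken from the YARN-10738 patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of a windowed shuffle; not code from the patch.
final class GapShuffleSketch {

    // Shuffles each consecutive window of `gap` elements independently.
    // An element never leaves its window, so the coarse sorted order is kept.
    static <T> List<T> shuffleWithGap(List<T> sorted, int gap, Random rnd) {
        if (gap <= 0) {
            throw new IllegalArgumentException("gap must be positive");
        }
        List<T> result = new ArrayList<>(sorted); // copy; the input is not modified
        for (int start = 0; start < result.size(); start += gap) {
            int end = Math.min(start + gap, result.size());
            // subList is a view, so shuffling it permutes `result` in place.
            Collections.shuffle(result.subList(start, end), rnd);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sortedNodes = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            sortedNodes.add("node-" + i);
        }
        // Each thread could use its own Random to get a different order.
        System.out.println(shuffleWithGap(sortedNodes, 4, new Random()));
    }
}
```

A larger gap spreads allocation attempts over more nodes but weakens the preference given by the sort policy. A gap of 1 leaves the sorted order unchanged.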
--
This message was sent by Atlassian Jira
(v8.3.4#803005)