[
https://issues.apache.org/jira/browse/YARN-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257863#comment-14257863
]
Kannan Rajah commented on YARN-796:
-----------------------------------
[~leftnoteasy] I think we can improve the performance of the load balancing
logic in FairScheduler.continuousSchedulingAttempt when Label Based Scheduling
(LBS) is active, and I would like your input. If you believe this is a valid
improvement, I would like to work on a proposal and a fix. Here is an overview
of the current logic.
{code}
for each node (ordered by cap remaining)
  for each schedulable (ordered by fairness)
    if a set of conditions are met
      assign the container to node
{code}
Problem:
When LBS is enabled, the set of conditions includes the label match. The node
with the most capacity remaining may not meet the label criteria, so why iterate
over the global set of nodes when only a subset of them can be used to schedule
a given application? The effect could be significant in a large cluster with
non-overlapping node labels. What we really need is to track a set of
"subclusters" and the applications that can be scheduled on each of them.
Within each subcluster, we maintain the node ordering by capacity remaining so
that tasks stay evenly distributed across its nodes.
{code}
for each subcluster
  if there are no applications belonging to it
    continue
  for each node in the subcluster (ordered by cap remaining)
    for each schedulable (ordered by fairness)
      if a set of conditions are met
        assign the container to node
{code}
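To make the proposal concrete, here is a minimal Java sketch of one scheduling
pass over subclusters. It is not the FairScheduler code: the Node and App
classes, the single-label model, the integer capacity, and the "capacity >=
demand" check are all hypothetical stand-ins for the real set of conditions. It
also assumes nodes are visited with the most remaining capacity first and that
the app list is already in fairness order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SubclusterSketch {
    // Hypothetical stand-in for a scheduler node; one label per node is assumed.
    static final class Node {
        final String name;
        final String label;
        int capRemaining;
        Node(String name, String label, int capRemaining) {
            this.name = name;
            this.label = label;
            this.capRemaining = capRemaining;
        }
    }

    // Hypothetical stand-in for a schedulable that requests a single label.
    static final class App {
        final String name;
        final String label;
        final int demand;
        App(String name, String label, int demand) {
            this.name = name;
            this.label = label;
            this.demand = demand;
        }
    }

    /** Runs one pass and returns "app->node" strings in assignment order. */
    static List<String> schedulePass(List<Node> nodes, List<App> apps) {
        // Partition nodes and apps by label; each label is one subcluster.
        Map<String, List<Node>> subclusters = new LinkedHashMap<>();
        for (Node n : nodes) {
            subclusters.computeIfAbsent(n.label, k -> new ArrayList<>()).add(n);
        }
        Map<String, List<App>> appsByLabel = new LinkedHashMap<>();
        for (App a : apps) {
            appsByLabel.computeIfAbsent(a.label, k -> new ArrayList<>()).add(a);
        }

        List<String> assignments = new ArrayList<>();
        for (Map.Entry<String, List<Node>> entry : subclusters.entrySet()) {
            List<App> candidates = appsByLabel.get(entry.getKey());
            if (candidates == null || candidates.isEmpty()) {
                continue; // no applications belong to this subcluster
            }
            // Order only this subcluster's nodes, most remaining capacity first.
            List<Node> ordered = new ArrayList<>(entry.getValue());
            ordered.sort(Comparator.comparingInt((Node n) -> n.capRemaining).reversed());
            for (Node node : ordered) {
                for (App app : candidates) { // assumed to be in fairness order
                    if (node.capRemaining >= app.demand) { // placeholder condition
                        node.capRemaining -= app.demand;
                        assignments.add(app.name + "->" + node.name);
                        break; // simplification: one container per node per pass
                    }
                }
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("n1", "gpu", 8), new Node("n2", "gpu", 4),
            new Node("n3", "ssd", 16), new Node("n4", "arm", 2));
        List<App> apps = Arrays.asList(
            new App("a1", "gpu", 4), new App("a2", "ssd", 20));
        System.out.println(schedulePass(nodes, apps));
    }
}
```

In this sketch the "arm" subcluster is skipped without sorting or scanning its
nodes, which is the saving the proposal is after. A real implementation would
also have to handle nodes with several labels and keep the per-subcluster
ordering up to date as capacity changes.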
> Allow for (admin) labels on nodes and resource-requests
> -------------------------------------------------------
>
> Key: YARN-796
> URL: https://issues.apache.org/jira/browse/YARN-796
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 2.4.1
> Reporter: Arun C Murthy
> Assignee: Wangda Tan
> Attachments: LabelBasedScheduling.pdf,
> Node-labels-Requirements-Design-doc-V1.pdf,
> Node-labels-Requirements-Design-doc-V2.pdf, YARN-796-Diagram.pdf,
> YARN-796.node-label.consolidate.1.patch,
> YARN-796.node-label.consolidate.10.patch,
> YARN-796.node-label.consolidate.11.patch,
> YARN-796.node-label.consolidate.12.patch,
> YARN-796.node-label.consolidate.13.patch,
> YARN-796.node-label.consolidate.14.patch,
> YARN-796.node-label.consolidate.2.patch,
> YARN-796.node-label.consolidate.3.patch,
> YARN-796.node-label.consolidate.4.patch,
> YARN-796.node-label.consolidate.5.patch,
> YARN-796.node-label.consolidate.6.patch,
> YARN-796.node-label.consolidate.7.patch,
> YARN-796.node-label.consolidate.8.patch, YARN-796.node-label.demo.patch.1,
> YARN-796.patch, YARN-796.patch4
>
>
> It will be useful for admins to specify labels for nodes. Examples of labels
> are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on
> resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.