bc Wong commented on YARN-796:

[~yufeldman] & [~sdaingade], I just read your proposal 
(LabelBasedScheduling.pdf) and have a few comments:

1. *Would let each node report its own labels.* The current proposal specifies 
the node-label mapping in a centralized file. This seems operationally 
unfriendly, as the file is hard to maintain.
* You need to get the DNS name right, which could be hard for a multi-homed 
* The proposal uses regexes on FQDN, such as {{perfnode.*}}. This may work if 
the hostnames are set up by IT like that. But in reality, I've seen lots of 
sites where the FQDN is like {{stmp09wk0013.foobar.com}}, where "stmp" refers 
to the data center, and "wk0013" refers to "worker 13", and other weird stuff 
like that. Now imagine a centralized node-label mapping file covering 2000 
nodes with such names. It'd be a nightmare.

Instead, each node can supply its own labels, via 
{{yarn.nodemanager.node.labels}} (which specifies labels directly) or 
{{yarn.nodemanager.node.labelFile}} (which points to a file that has a single 
line containing all the labels). It's easy to generate the label file for each 
node. The admin can have puppet push it out, or populate it when the VM is 
built, or compute it in a local script by inspecting /proc. (Oh I have 192GB, 
so add the label "largeMem".) There is little room for mistake.
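
A local script of that kind could look roughly like the sketch below. The 128 GB threshold, the comma separator in the label file, and the function names are assumptions; only the idea of reading {{MemTotal}} from /proc/meminfo and writing a single-line file comes from the text above.

```python
def labels_from_meminfo(meminfo_text, large_mem_gb=128):
    """Derive node labels from the contents of /proc/meminfo.

    The threshold and the label name are illustrative assumptions.
    """
    labels = []
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            total_kb = int(line.split()[1])  # /proc/meminfo reports kB
            if total_kb >= large_mem_gb * 1024 * 1024:
                labels.append("largeMem")
    return labels


def write_label_file(path, labels):
    """Write all labels on a single line (comma-separated, by assumption)."""
    with open(path, "w") as f:
        f.write(",".join(labels) + "\n")


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        found = labels_from_meminfo(f.read())
    # The output path is hypothetical; it would be whatever
    # yarn.nodemanager.node.labelFile points to on this node.
    write_label_file("node-labels.txt", found)
```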

The NM can still periodically refresh its own labels and update the RM via 
the heartbeat mechanism. The RM should also expose a "node label report" that 
shows, in real time, every node and its current labels.
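
As a rough sketch of the RM-side bookkeeping this implies (a toy model, not YARN code; the class and method names are hypothetical), each heartbeat replaces the node's previous labels and the report is a snapshot of that state:

```python
class NodeLabelRegistry:
    """Toy stand-in for the RM's view of node labels."""

    def __init__(self):
        self._labels_by_node = {}

    def on_heartbeat(self, node_id, labels):
        # The latest heartbeat replaces whatever the node reported before.
        self._labels_by_node[node_id] = frozenset(labels)

    def report(self):
        # Snapshot of node -> sorted labels, i.e. the "node label report".
        return {
            node: sorted(labels)
            for node, labels in sorted(self._labels_by_node.items())
        }


registry = NodeLabelRegistry()
registry.on_heartbeat("node1", ["largeMem"])
registry.on_heartbeat("node2", ["gpu"])
# node1 refreshes its labels on a later heartbeat.
registry.on_heartbeat("node1", ["largeMem", "ssd"])
print(registry.report())
```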

2. *Labels are per-container, not per-app. Right?* The doc keeps mentioning 
"application label", "ApplicationLabelExpression", etc. Should those be 
"container label" instead? I just want to confirm that each container request 
can carry its own label expression. Example use case: Only the mappers need 
GPU, not the reducers.
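
To make the question concrete, this is the shape I have in mind, as a sketch only. {{ContainerRequest}} and its fields are hypothetical and are not the YARN resource-request API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContainerRequest:
    """Hypothetical request where each container carries its own label expression."""

    role: str
    memory_mb: int
    label_expression: Optional[str] = None  # None means "any node"


# One application, two kinds of containers: only the mappers ask for GPU nodes.
requests = [ContainerRequest("mapper", 4096, "gpu") for _ in range(3)]
requests.append(ContainerRequest("reducer", 8192))

for request in requests:
    print(request.role, request.label_expression)
```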

3. *Can we fail container requests with no satisfying nodes?* In 
"Considerations, #5", you wrote that the app would be in a waiting state. It 
seems that fail-fast behaviour would be better. If no node can satisfy the label 
expression, then it's better to tell the client "no". Very likely somebody made 
a typo somewhere.
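
A minimal sketch of such a fail-fast check follows. It assumes a label expression is just a set of labels that must all be present on a node, which is simpler than whatever expression grammar the proposal defines, and all names are hypothetical.

```python
class UnsatisfiableLabelError(Exception):
    """Raised when no known node carries every requested label."""


def check_satisfiable(required_labels, labels_by_node):
    """Return the matching nodes, or raise if there are none.

    Assumption: the expression is a plain conjunction of labels.
    """
    required = set(required_labels)
    matching = sorted(
        node for node, labels in labels_by_node.items() if required <= set(labels)
    )
    if not matching:
        raise UnsatisfiableLabelError(
            "no node satisfies labels: " + ", ".join(sorted(required))
        )
    return matching


nodes = {"node1": ["largeMem"], "node2": ["gpu", "largeMem"]}
print(check_satisfiable(["gpu"], nodes))

try:
    check_satisfiable(["gpuu"], nodes)  # a typo in the label
except UnsatisfiableLabelError as err:
    print("rejected:", err)
```

Rejecting at submission time surfaces the typo immediately instead of leaving the request waiting indefinitely.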

> Allow for (admin) labels on nodes and resource-requests
> -------------------------------------------------------
>                 Key: YARN-796
>                 URL: https://issues.apache.org/jira/browse/YARN-796
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun C Murthy
>            Assignee: Wangda Tan
>         Attachments: LabelBasedScheduling.pdf, YARN-796.patch
> It will be useful for admins to specify labels for nodes. Examples of labels 
> are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on 
> resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.

This message was sent by Atlassian JIRA
