Wangda Tan updated YARN-796:

    Attachment: YARN-796.node-label.demo.patch.1

Hi guys,
Thanks for your input over the past several weeks. I implemented a patch based 
on the design doc during the past two weeks, and I would really appreciate it if 
you could take a look. The patch is: YARN-796.node-label.demo.patch.1 (I gave it 
a longer name to avoid confusion with other patches).

*Already included in this patch:*
* Protocol changes for ResourceRequest and ApplicationSubmissionContext (leveraged 
the contribution from Yuliya's patch, thanks). Also updated AMRMClient.
* RMAdmin changes to dynamically update the labels of a node (add/set/remove); 
also updated the RMAdmin CLI.
* Capacity scheduler related changes, including: 
** Headroom calculation, preemption, and container allocation respect labels. 
** Allow users to set the list of labels a queue can access in capacity-scheduler.xml.
* A centralized node label manager that can be updated dynamically to 
add/set/remove labels and can store labels to the file system. It works with the 
RM restart/HA scenario (similar to RMStateStore).
* Support for the {{--labels}} option in distributed shell, so we can use 
distributed shell to test this feature.
* Related unit tests

*Will include later:*
* RM REST APIs for node label
* Distributed configuration (set labels in yarn-site.xml of NMs)
* Support labels in FairScheduler

*Try this patch*
1. Create a capacity-scheduler.xml with labels accessible on queues
   /  \
  a    b
  |    |
  a1   b1

a.capacity = 50, b.capacity = 50 
a1.capacity = 100, b1.capacity = 100

And a.label = red,blue; b.label = blue,green
    <value>red, blue</value>

    <value>blue, green</value>
This means queue a (and its sub-queues) can access labels red and blue; queue b 
(and its sub-queues) can access labels blue and green.
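
A minimal sketch of the inheritance rule described above, assuming a sub-queue simply takes the labels of its nearest ancestor that has labels configured. The dictionaries and the function name are illustrative only and are not taken from the patch.

```python
# Hypothetical in-memory view of the queue tree from step 1 (parent of each queue).
PARENT = {"a": None, "b": None, "a1": "a", "b1": "b"}
# Labels configured directly on a queue (values taken from the example above).
CONFIGURED_LABELS = {"a": {"red", "blue"}, "b": {"blue", "green"}}


def accessible_labels(queue):
    """Walk up the tree until a queue with configured labels is found."""
    while queue is not None:
        if queue in CONFIGURED_LABELS:
            return set(CONFIGURED_LABELS[queue])
        queue = PARENT[queue]
    return set()  # no ancestor has labels configured


for name in ("a", "a1", "b", "b1"):
    print(name, sorted(accessible_labels(name)))
```

In this model a1 resolves to the same labels as a, and b1 to the same labels as b.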

2. Create a node-labels.json locally; this holds the initial labels on nodes. 
(This step is optional, since you can change labels dynamically using the 
rmadmin CLI while the RM is running.) Then set 
{{yarn.resourcemanager.labels.node-to-label-json.path}} to 
       "labels":["red", "blue"]
       "labels":["blue", "green"]
This sets red/blue labels on host1, and sets blue/green labels on host2

3. Start the YARN cluster (if you have several nodes in the cluster, you need to 
launch HDFS to use distributed shell)
* Submit a distributed shell:
hadoop jar path/to/*distributedshell*.jar \
  org.apache.hadoop.yarn.applications.distributedshell.Client \
  -shell_command hostname -jar path/to/*distributedshell*.jar \
  -num_containers 10 -labels "red && blue" -queue a1
This runs a distributed shell that launches 10 containers running the command 
"hostname" with the requested label "red && blue"; all containers will be allocated on 

Some other examples:
* {{-queue a1 -labels "red && green"}}: this will be rejected, because queue a1 
cannot access label green
* {{-queue a1 -labels "blue"}}: some containers will be allocated on host1 and 
others on host2, because both host1 and host2 have the "blue" label
* {{-queue b1 -labels "green"}}, all containers will be allocated on host2
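
The examples above can be modeled with a small sketch. It assumes AND-only expressions joined by "&&" and a two-step check (queue access first, then node match). The dictionaries, function names, and the use of ValueError for a rejected request are hypothetical and are not the patch's actual API.

```python
# Hypothetical model of the example cluster; values come from steps 1 and 2.
QUEUE_LABELS = {"a1": {"red", "blue"}, "b1": {"blue", "green"}}
NODE_LABELS = {"host1": {"red", "blue"}, "host2": {"blue", "green"}}


def parse_expression(expr):
    """Split an AND-only expression such as 'red && blue' into a set of labels."""
    return {part.strip() for part in expr.split("&&") if part.strip()}


def candidate_nodes(queue, expr):
    """Return the sorted nodes that carry every requested label.

    Raises ValueError if the queue cannot access one of the requested labels.
    """
    wanted = parse_expression(expr)
    denied = wanted - QUEUE_LABELS[queue]
    if denied:
        raise ValueError(f"queue {queue} cannot access: {sorted(denied)}")
    return sorted(node for node, labels in NODE_LABELS.items() if wanted <= labels)


print(candidate_nodes("a1", "blue"))   # ['host1', 'host2']
print(candidate_nodes("b1", "green"))  # ['host2']
try:
    candidate_nodes("a1", "red && green")
except ValueError as err:
    print("rejected:", err)
```

Checking queue access before matching nodes mirrors the order of the examples: a request is rejected outright if the queue lacks a label, regardless of which nodes carry it.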

4. Dynamically update labels using rmadmin CLI
// dynamically add labels x, y to label manager
yarn rmadmin -addLabels x,y

// dynamically set label x on node1, and labels x, y on node2
yarn rmadmin -setNodeToLabels "node1:x;node2:x,y"

// remove labels from label manager, and also remove labels on nodes
yarn rmadmin -removeLabels x
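
The effect of these three commands can be illustrated with a toy in-memory model. This is only a sketch: the class, its method names, and the rule that a label must be added before it can be set on a node are assumptions, not the patch's implementation.

```python
class NodeLabelManager:
    """Toy stand-in for the centralized node label manager (hypothetical)."""

    def __init__(self):
        self.known = set()         # labels registered with the manager
        self.node_to_labels = {}   # node name -> set of labels on that node

    def add_labels(self, labels):
        """Model of -addLabels: register new labels."""
        self.known |= set(labels)

    def set_node_labels(self, mapping):
        """Model of -setNodeToLabels: parse 'node1:x;node2:x,y' and replace labels."""
        for entry in mapping.split(";"):
            node, _, labels = entry.partition(":")
            wanted = {label for label in labels.split(",") if label}
            unknown = wanted - self.known
            if unknown:  # assumption: unregistered labels are refused
                raise ValueError(f"unknown labels: {sorted(unknown)}")
            self.node_to_labels[node] = wanted

    def remove_labels(self, labels):
        """Model of -removeLabels: drop labels from the manager and from all nodes."""
        gone = set(labels)
        self.known -= gone
        for node_labels in self.node_to_labels.values():
            node_labels.difference_update(gone)


mgr = NodeLabelManager()
mgr.add_labels(["x", "y"])
mgr.set_node_labels("node1:x;node2:x,y")
mgr.remove_labels(["x"])
print(mgr.node_to_labels)  # {'node1': set(), 'node2': {'y'}}
```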

*Two more examples for node label*
1. Labels as constraints:
Queue structure:
   / | \
  a  b  c

a has label: WINDOWS, LINUX, GPU
c doesn't have label

25 nodes in the cluster:
h1-h5:   LINUX, GPU
h6-h10:  LINUX,
h21-h25: <empty>
If you want a "LINUX && GPU" resource, you should submit to queue-a and set the 
label in the Resource Request to "LINUX && GPU".
If you want a "LARGE_MEM" resource and don't mind its OS, you can submit to 
queue-b and set the label in the Resource Request to "LARGE_MEM".
If you want to allocate on nodes that don't have labels (h21-h25), you can submit 
to any queue and leave the label in the Resource Request empty.

2. Labels to hard partition cluster
Queue structure:
   / | \
  a  b  c

a has label: MARKETING
b has label: HR
c has label: RD

15 nodes in the cluster:
h1-h5:   MARKETING
h6-h10:  HR
h11-h15: RD
Now the cluster is hard partitioned into 3 small clusters. h1-h5 are for 
marketing and only queue-a can use them; you should set the label in the Resource 
Request to "MARKETING". The HR and RD clusters work the same way. 
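
As a rough check of the partitioning idea, the sketch below assumes one label per queue and one label per node, as in this example, and verifies that every node is usable by exactly one queue. The data structures and function names are illustrative only.

```python
# Values copied from the example; the data structures themselves are hypothetical.
QUEUE_LABEL = {"a": "MARKETING", "b": "HR", "c": "RD"}
NODE_RANGES = [(1, 5, "MARKETING"), (6, 10, "HR"), (11, 15, "RD")]

# Expand the ranges into a host -> label map (h1 .. h15).
NODE_LABEL = {
    f"h{i}": label
    for first, last, label in NODE_RANGES
    for i in range(first, last + 1)
}


def nodes_for_queue(queue):
    """Nodes whose label matches the queue's single accessible label."""
    label = QUEUE_LABEL[queue]
    return [host for host, host_label in NODE_LABEL.items() if host_label == label]


def is_hard_partition():
    """True if every node is usable by exactly one queue."""
    owners = {host: 0 for host in NODE_LABEL}
    for queue in QUEUE_LABEL:
        for host in nodes_for_queue(queue):
            owners[host] += 1
    return all(count == 1 for count in owners.values())


print(nodes_for_queue("a"))  # ['h1', 'h2', 'h3', 'h4', 'h5']
print(is_hard_partition())   # True
```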

I would appreciate your feedback on this patch. Do you think it is heading in 
the right direction? If you think it's fine, I will break the patch down into 
several smaller patches and create some sub-JIRAs for easier review.

Wangda Tan

> Allow for (admin) labels on nodes and resource-requests
> -------------------------------------------------------
>                 Key: YARN-796
>                 URL: https://issues.apache.org/jira/browse/YARN-796
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 2.4.1
>            Reporter: Arun C Murthy
>            Assignee: Wangda Tan
>         Attachments: LabelBasedScheduling.pdf, 
> Node-labels-Requirements-Design-doc-V1.pdf, 
> Node-labels-Requirements-Design-doc-V2.pdf, YARN-796.node-label.demo.patch.1, 
> YARN-796.patch, YARN-796.patch4
> It will be useful for admins to specify labels for nodes. Examples of labels 
> are OS, processor architecture etc.
> We should expose these labels and allow applications to specify labels on 
> resource-requests.
> Obviously we need to support admin operations on adding/removing node labels.
