Wangda Tan commented on YARN-3214:

Hi [~lohit],
Allowing multiple labels on the same node and allowing at most one label per 
node are quite different problems:

With at most one label on each node, the cluster becomes several disjoint 
sub-clusters, and any scheduling algorithm (capacity, fair, or FIFO) can 
simply run on each sub-cluster independently.
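
As a rough illustration of the one-label-per-node case (a sketch only, not 
YARN code; the node names, memory sizes, and queue percentages are made up): 
grouping nodes by their single label gives disjoint partitions, and each 
queue's guarantee is just a percentage of its partition's total.

```python
from collections import defaultdict

# Hypothetical nodes: name -> (single label, memory in GB). "" means unlabeled.
NODES = {
    "node1": ("X", 100),
    "node2": ("X", 100),
    "node3": ("", 200),
}

# Hypothetical per-partition queue capacities, in percent.
QUEUE_SHARE = {
    "X": {"queue-A": 40, "queue-B": 60},
    "": {"queue-A": 50, "queue-B": 50},
}


def partition_totals(nodes):
    """Sum node resources per label; each node counts in exactly one partition."""
    totals = defaultdict(int)
    for label, memory in nodes.values():
        totals[label] += memory
    return dict(totals)


def queue_guarantees(nodes, shares):
    """Guaranteed resource of each queue inside each partition."""
    totals = partition_totals(nodes)
    return {
        label: {queue: totals[label] * pct // 100 for queue, pct in queues.items()}
        for label, queues in shares.items()
    }
```

Because every node lands in exactly one partition, the partition totals add 
up to the physical cluster size, and each partition can be scheduled as if it 
were a small cluster of its own.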

Suppose you want to divide the resources of a label among queues (as in the 
example above, where queue-A can use 40% of label-X and queue-B can use 60% 
of label-X). Once we support multiple labels (say X and Y) on the same node 
(say node1), the sub-clusters overlap, and that makes scheduling very hard: 
when qA can access X and qB can access Y, how much of node1's resources do 
you plan to allocate to qA and to qB? A more complex example: node1 has X,Y; 
node2 has X only; node3 has X,Z. This is a very tough problem and, as far as 
I know (please let me know if I missed anything), no platform has solved it 
perfectly.
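
To make the overlap concrete, here is a small sketch using the three-node 
example above. The 100 GB per node is an assumed figure and the helper names 
are hypothetical; this is not how any scheduler computes capacity, it only 
shows where the ambiguity comes from.

```python
# Nodes from the example above; the 100 GB per node is an assumed figure.
NODE_LABELS = {
    "node1": {"X", "Y"},
    "node2": {"X"},
    "node3": {"X", "Z"},
}
NODE_MEMORY_GB = 100


def label_totals(node_labels, memory_per_node):
    """Resource reachable under each label; multi-label nodes count once per label."""
    totals = {}
    for labels in node_labels.values():
        for label in labels:
            totals[label] = totals.get(label, 0) + memory_per_node
    return totals


def overlapping_nodes(node_labels, a, b):
    """Nodes that carry both label a and label b."""
    return sorted(n for n, labels in node_labels.items() if {a, b} <= labels)
```

Here the per-label totals come to X=300, Y=100, Z=100, which sums to 500 GB on 
a 300 GB cluster. node1 is counted under both X and Y, so a percentage of X 
and a percentage of Y no longer determine a unique split of node1.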

This is why separating partitions from attributes/constraints is important. 
A partition is a way to divide the cluster: each sub-cluster has properties 
similar to a general cluster resource setting (such as how it is shared among 
queues). That is useful when a set of nodes is contributed to, and shared by, 
only a subset of the queues in the cluster. An attribute is just a way to 
place containers. A simple way to handle attributes/constraints is FCFS 
(first come, first served), with no quota assigned to any attribute.
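
A minimal sketch of what FCFS over attributes could look like, assuming a 
simple slot-based model (the function name, the request format, and the slot 
counts are all hypothetical): requests are served strictly in arrival order, 
and nothing tracks how much of any attribute an application has consumed.

```python
from collections import deque


def fcfs_allocate(requests, nodes):
    """Serve requests in arrival order; no per-attribute quota is tracked.

    requests: iterable of (app, required_attributes: set)
    nodes: dict node -> (attributes: set, free_slots: int)
    Returns a list of (app, node) assignments; unmatched requests are skipped.
    """
    free = {name: slots for name, (_, slots) in nodes.items()}
    assignments = []
    pending = deque(requests)
    while pending:
        app, required = pending.popleft()
        for name, (attrs, _) in nodes.items():
            # A node qualifies if it has every required attribute and a free slot.
            if required <= attrs and free[name] > 0:
                free[name] -= 1
                assignments.append((app, name))
                break
    return assignments
```

Note the consequence of having no quota: an early request that only needs a 
common attribute may take the last slot on a node that a later, more 
demanding request could have used.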

Mesos is different here: it doesn't do anything with node attributes on the 
scheduling side. All node attributes are passed directly to the framework 
side, and the framework decides whether to accept or reject an offer 
according to the node's attributes. Mesos does not take care of balancing 
framework shares across attributes.
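
The offer model described above can be sketched roughly as follows. This is 
plain Python for illustration, not the Mesos API; the function name, the 
offer format, and the accept callbacks are assumptions.

```python
def offer_resources(offers, frameworks):
    """Pass each offer, attributes included, to frameworks in turn.

    offers: list of (node, attributes: dict)
    frameworks: list of (name, accepts) where accepts(attributes) -> bool
    The allocator never inspects attributes itself; it only records who accepted.
    """
    placements = {}
    for node, attributes in offers:
        for name, accepts in frameworks:
            if accepts(attributes):
                placements[node] = name
                break
    return placements
```

All attribute logic lives in the frameworks' accept callbacks, so nothing in 
the allocator balances how many nodes with a given attribute each framework 
ends up holding.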

> Add non-exclusive node labels 
> ------------------------------
>                 Key: YARN-3214
>                 URL: https://issues.apache.org/jira/browse/YARN-3214
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: Non-exclusive-Node-Partition-Design.pdf
> Currently node labels partition the cluster into sub-clusters, so resources 
> cannot be shared between the partitions. 
> With the current implementation of node labels we cannot use the cluster 
> optimally and the throughput of the cluster will suffer.
> We are proposing adding non-exclusive node labels:
> 1. Labeled apps get the preference on Labeled nodes 
> 2. If there is no ask for labeled resources, we can assign those nodes to 
> non-labeled apps
> 3. If there is any future ask for those resources, we will preempt the 
> non-labeled apps and give the resources back to labeled apps.
