[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712946#comment-15712946
 ] 

Wangda Tan commented on YARN-3409:
----------------------------------

Thanks all for the discussion, and sorry for the delayed response. I'm still on 
vacation and should be able to participate in more discussions from next week.

[~Naganarasimha], I think a POC will be helpful for understanding the scope, but 
from my POV, the biggest challenge of this task is properly designing the API and 
deciding how to make it work with existing features (like locality/partition) 
and future features (like YARN-4902 / global scheduling, etc.).

Here are my overall thoughts:

h3. 1. For the (old) ResourceRequest API, we have a couple of choices:

*Choice a.*
Use nodeLabelExpression to specify nodePartition and nodeConstraint together.
Pros:
- This has good semantics, since a "constraint" could be considered one special 
kind of "label".

Cons: 
- We have to add a new field for affinity/anti-affinity.
- Some existing implementations assume "nodeLabelExpression" equals the partition.

*Choice b.*
Add a new field for the constraint expression, and also one for affinity/anti-affinity 
(as suggested by Kostas). This should have minimal impact on existing 
features. But after this, "nodeLabelExpression" becomes a little ambiguous, and 
we may need to deprecate the existing nodeLabelExpression.

*My preference:*
*Personally I prefer b.* But it's better to rename nodeConstraintExpression to 
placementStrategy so we have consistent naming and semantics after 
YARN-4902. Actually, in my POC patch for YARN-1042 
(https://issues.apache.org/jira/secure/attachment/12822186/YARN-1042-global-scheduling.poc.1.patch#file-0),
 I use {{placementStrategy}} as the name.
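
To make choice b concrete, here is a minimal sketch of what a request could look like with the partition and the constraint kept in separate fields. It is written in Python only for brevity (the real API is Java), and the class name {{ResourceRequestSketch}} and its field names are hypothetical illustrations, not the actual YARN ResourceRequest. It also assumes an unset nodeLabelExpression maps to the default (empty) partition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceRequestSketch:
    """Hypothetical request shape for choice b; not the real YARN class."""
    resource_name: str                            # "*" (ANY), a rack, or a host
    memory_mb: int
    vcores: int
    node_label_expression: Optional[str] = None   # existing field, partition only
    placement_strategy: Optional[str] = None      # new field: constraint / affinity

    def partition(self) -> str:
        # Existing code paths keep reading the partition from
        # nodeLabelExpression; unset means the default (empty) partition.
        return self.node_label_expression or ""


req = ResourceRequestSketch(
    "*", 2048, 1,
    node_label_expression="gpu",
    placement_strategy="jdk.version >= 8",
)
print(req.partition(), "|", req.placement_strategy)
```

With this shape, code that assumes nodeLabelExpression equals the partition keeps working, and only constraint-aware code needs to look at the new field.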

h3. 2. For CLI / REST API to manage (add/remove) and get constraint, we also 
have a couple of choices:

*Choice a.*
Add a set of new APIs / CLI commands to manage node constraints. For example, we can 
have a REST API {{POST /node-constraints/add}} to add node constraints.

*Choice b.*
Extend the existing {{NodeLabel}} object to support node constraints. We only need 
two additional fields: 1) isNodeConstraint 2) Value 
(for example, we can have a constraint named jdk-version whose value could be 
6/7/8).

*My preference:*
Personally I prefer b, since we can reuse most of the existing CLI / REST API 
implementations.
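
As a rough illustration of choice b, the extended label object could look like the sketch below. It is in Python for brevity, and {{NodeLabelSketch}} is a hypothetical stand-in for the real {{NodeLabel}} class. The rule that only constraints may carry a value is my assumption, not something decided in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeLabelSketch:
    """Hypothetical stand-in for NodeLabel with the two proposed fields."""
    name: str
    is_exclusive: bool = True          # existing NodeLabel attribute
    is_node_constraint: bool = False   # proposed field 1
    value: Optional[str] = None        # proposed field 2

    def __post_init__(self):
        # Assumption: a plain partition label never carries a value.
        if self.value is not None and not self.is_node_constraint:
            raise ValueError("only constraint labels may carry a value")


partition = NodeLabelSketch("gpu")
jdk = NodeLabelSketch("jdk-version", is_node_constraint=True, value="8")
print(partition)
print(jdk)
```

Because both kinds of label share one object, the existing add/remove/list code paths could in principle accept constraints without a parallel set of endpoints.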

I found there are some other points that need to be settled, such as whether to 
support rack constraints, whether node constraints should be allowed on non-ANY 
requests, etc. I suggest discussing them after we reach consensus on the above two points.

Please share your thoughts (clearly stating which one you prefer most, with some 
explanation, would be best). 

+ [~naganarasimha...@apache.org], [~kkaranasos], [~devaraj.k], [~varun_saxena].

> Add constraint node labels
> --------------------------
>
>                 Key: YARN-3409
>                 URL: https://issues.apache.org/jira/browse/YARN-3409
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, capacityscheduler, client
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf
>
>
> Specifying only one label for each node (IAW, partitioning a cluster) is a way to 
> determine how resources of a special set of nodes could be shared by a 
> group of entities (like teams, departments, etc.). Partitions of a cluster 
> have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACL/priority can apply to a partition (only the market team can use the 
> partition / the market team has priority to use it).
> - Percentage of capacities can apply to a partition (Market team has 40% 
> minimum capacity and Dev team has 60% of minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node’s hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources that have (glibc.version >= 
> 2.20 && JDK.version >= 8u20 && x86_64).
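
The example expression in the description above could be evaluated against a node's attributes roughly as follows. This is only a sketch in Python under several assumptions of mine: clauses are joined by {{&&}} only, comparisons are written with spaces around the operator, versions are compared by their numeric parts (so {{8u20}} becomes (8, 20)), and a bare name such as {{x86_64}} is looked up in a hypothetical "tags" set. None of this is the actual YARN expression syntax or implementation.

```python
import operator
import re

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       ">": operator.gt, "<": operator.lt}


def version_key(text):
    # "2.20" -> (2, 20); "8u20" -> (8, 20): compare the numeric runs in order.
    return tuple(int(n) for n in re.findall(r"\d+", text))


def matches(expression, node_attrs):
    """Return True if every &&-joined clause holds for the node."""
    for clause in expression.split("&&"):
        parts = clause.split()
        if len(parts) == 1:
            # Bare name such as x86_64: must be present as a tag on the node.
            if parts[0] not in node_attrs.get("tags", set()):
                return False
        else:
            key, op, wanted = parts
            have = node_attrs.get(key)
            if have is None:
                return False
            if not OPS[op](version_key(have), version_key(wanted)):
                return False
    return True


expr = "glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64"
new_node = {"glibc.version": "2.23", "JDK.version": "8u45", "tags": {"x86_64"}}
old_node = {"glibc.version": "2.17", "JDK.version": "8u45", "tags": {"x86_64"}}
print(matches(expr, new_node), matches(expr, old_node))
```

Here the first node satisfies all three clauses, while the second fails on the glibc version.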



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

