[ https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712803#comment-15712803 ]

Naganarasimha G R commented on YARN-3409:
-----------------------------------------

Thanks for the support [~devaraj.k] & [~kkaranasos].
It's encouraging to see that this feature would be useful for other scenarios too.
bq. Can NodeManagers have attribute names same as some label/partition name in 
the cluster? 
This is one of the reasons I want to separate these into two different sets (partitions and constraints): even if the same name is used, there will be no overlap, since the partition and constraint expressions differ.
bq. Did you think about having one expression(existing) which handles node 
label expression and constraints expression without delimiter between label and 
constraints expressions, constraints expression support implementation can be 
added without any new configurations/interfaces.
Yes Deva, I had considered this option and have described it in section {{Topics for 
discussion -> ResourceRequest modifications -> Option 3}}; hope you have taken a look at it. If required we can discuss it further.
bq. Can we have some details about how the NodeManager report these attributes 
to ResourceManager?
From the NM side there are two things being discussed here. One is supporting scripts or configs on the NM side, which is already supported as part of YARN-2495; for the latest documentation you can refer to 
[features|http://hadoop.apache.org/docs/r3.0.0-alpha1/hadoop-yarn/hadoop-yarn-site/NodeLabel.html#Features].
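For reference, the existing NM-side distributed configuration for partition labels looks roughly like the snippet below (property names as per the linked NodeLabel documentation; please verify them against the docs for your Hadoop version):
{code:xml}
<!-- yarn-site.xml: distributed node-label configuration (YARN-2495). -->
<!-- Property names taken from the linked NodeLabel docs; verify for your version. -->
<property>
  <name>yarn.node-labels.configuration-type</name>
  <value>distributed</value>
</property>
<property>
  <!-- "config" reads the label from yarn-site.xml; "script" runs an NM-side script. -->
  <name>yarn.nodemanager.node-labels.provider</name>
  <value>config</value>
</property>
<property>
  <name>yarn.nodemanager.node-labels.provider.configured-node-partition</name>
  <value>GPU</value>
</property>
{code}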

bq. I also (strongly ) suggest to use these ConstraintNodeLabels in the context 
of YARN-1042 for supporting (anti-)affinity constraints as well. I think it 
will greatly avoid duplicate effort and simplify the code.
Agreed. I presume it should be pluggable enough, either at the NM or at the RM side, for these app-specific constraints on the nodes. Maybe once my initial prototype is ready we can discuss this further.

bq. On a similar note, can these ConstraintNodeLabels be added/removed 
dynamically? For example, when a container starts its execution, it might 
contain some attributes (to be added – I know such attributes cannot be 
specified at the moment). Those attributes will then be added to the node's 
labels, for the time the container is running. This can be useful for 
(anti-)affinity constraints.
Yeah, we were looking at some scenarios where NM-side resource stats like load average, swap, etc. are used as dynamic constraint values. But what you specified is also an interesting use case which can easily fit into the proposed scheme of things (though we need to be careful about the security implications of such constraints).

bq. so we might consider using a name that denotes that 
Well, actually the labels (partitions) could have been named pools and the constraints labels, but anyway, as for the existing names I am fine with either {{ConstraintExpression}} or {{AttributeExpression}}, though I would prefer the former as to an extent it is already in use.

bq.  It might also be that the implementation of ConstraintNodeLabels will be 
easier at some places than that of NodeLabels/Partitions
Yes, agreed: there are fewer modifications to the scheduler node, hence it is relatively simple, but we might have to modify multiple places to make it usable, hence we proposed a branch. Once I upload the WIP patches I will create the branch and an umbrella JIRA. I thought of cloning this JIRA so that discussion continues here and tracking happens in the other.

bq. Can you please give an example of a cluster-level constraint?
A cluster-level constraint is similar to the existing {{ClusterNodeLabels}}: it is the superset of all the node-level constraint labels available, and the RM uses it for validating expressions and for scheduling. As we plan to support different types of constraints, we need to maintain this superset so that constraints reported by one node do not conflict with those of another (if the types are different).
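To illustrate the kind of type-conflict validation meant here, a minimal sketch (all class and method names are hypothetical, not from the actual patch):
{code:java}
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a cluster-level store of constraint labels and their value types. */
public class ClusterConstraintStore {

  public enum ConstraintType { STRING, LONG, BOOLEAN }

  /** Constraint name -> value type, i.e. the cluster-wide superset. */
  private final Map<String, ConstraintType> clusterConstraints = new HashMap<>();

  /**
   * Called when a node reports a constraint; rejects the report if the same
   * constraint name was already registered by another node with a different type.
   */
  public synchronized void addNodeConstraint(String name, ConstraintType type) {
    ConstraintType existing = clusterConstraints.putIfAbsent(name, type);
    if (existing != null && existing != type) {
      throw new IllegalArgumentException("Constraint '" + name + "' is already registered as "
          + existing + " but the node reported it as " + type);
    }
  }
}
{code}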

bq. Making sure I understand.. Why do we need this constraint? I think they are 
orthogonal, right? Unless you mean that if the user specifies a constraint, it 
has to be taken into account too, which I understand.
Agreed, they are orthogonal, but a partition is like a logical cluster, and when the user specifies a partition then the nodes under that partition which satisfy the constraint labels need to be picked, to keep compatibility with the current partition feature (partition capacity planning, headroom, etc.). Currently we only support a partition expression containing a single partition; I envisage something like {{partition1||partition2}} with a constraint expression of {{(HAS_GPU && !Windows)}}, so that all the nodes of partition1 and partition2 satisfying the constraint expression will be picked for scheduling the container.
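A minimal sketch of the node-selection semantics described above (the classes and the evaluator interface here are illustrative only, not the actual scheduler code):
{code:java}
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Illustrative only: pick nodes in one of the requested partitions that also satisfy the constraint expression. */
public class PartitionConstraintFilter {

  /** Evaluates an expression such as "(HAS_GPU && !Windows)" against a node's constraint labels. */
  interface ConstraintEvaluator {
    boolean matches(String constraintExpression, Set<String> nodeConstraintLabels);
  }

  /** Node abstraction used only for this sketch. */
  static class SchedulableNode {
    String partition;                // e.g. "partition1"
    Set<String> constraintLabels;    // e.g. {"HAS_GPU", "Linux"}
  }

  static List<SchedulableNode> selectNodes(List<SchedulableNode> nodes,
      Set<String> requestedPartitions,   // from the partition expression, e.g. {"partition1", "partition2"}
      String constraintExpression,       // e.g. "(HAS_GPU && !Windows)"
      ConstraintEvaluator evaluator) {
    return nodes.stream()
        .filter(n -> requestedPartitions.contains(n.partition))
        .filter(n -> evaluator.matches(constraintExpression, n.constraintLabels))
        .collect(Collectors.toList());
  }
}
{code}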

bq. We assumed that the ConstraintNodeLabels are following the hierarchy of the 
cluster. That is, a rack was inheriting the ConstraintNodeLabels of all its 
nodes. A detail here is that we considered only existential 
ConstraintNodeLabels 
Maybe I did not fully get what you mentioned. What we were discussing was that the constraint expression could differ across ResourceRequests, i.e. the node/rack/any requests could carry different constraint expressions. For example, if I get node locality I may not want any constraints; if I get rack-local nodes, pick nodes satisfying, say, {{(HAS_SSD || NUM_DISKS>4)}}; and for ANY, pick nodes with a different expression, say {{(NUM_SSD_DISK > 4)}}.
But I was not able to completely understand the need to have constraints on the rack itself. IIUC, even now, and with the global scheduler, scheduling is done on the node itself and not one layer above, i.e. the rack. Yes, when selecting a node we do check its rack, but since we have the constraints of the node itself, we need not worry about the rack, right?
bq. For instance, group nodes that belong to the same upgrade domain (being 
upgraded at the same time – we see this use case a lot in our clusters).
With the proposed approach we could achieve this with a new String-based label like {{upgrade_group}} whose values on the nodes could be *group1, group2, ...*
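For instance (expression syntax here is purely illustrative of the proposed style), an app that should avoid the domain currently being upgraded could ask for something like:
{noformat}
upgrade_group != group1
{noformat}
while an app that wants to stay within a single upgrade domain could ask for {{upgrade_group == group2}}.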

Sorry for the delay; I have a WIP patch with test cases for expression evaluation (with a custom Constraint) almost ready. I am just updating final nits of the expression doc and will upload it by tomorrow. Next week I will upload a WIP for scheduling and configuration. This should help clarify how we view this feature, and we can then think more about further optimizations.



> Add constraint node labels
> --------------------------
>
>                 Key: YARN-3409
>                 URL: https://issues.apache.org/jira/browse/YARN-3409
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, capacityscheduler, client
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf
>
>
> Specifying only one label for each node (in other words, partitioning the 
> cluster) is a way to determine how the resources of a particular set of nodes 
> can be shared by a group of entities (like teams, departments, etc.). 
> Partitions of a cluster have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can be applied per partition (only the market team can use 
> the partition / the market team has priority to use the partition).
> - Capacity percentages can be applied per partition (the market team has 40% 
> minimum capacity and the dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a 
> node's hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (Windows, Linux, etc.)
> With this, an application is able to ask for resources satisfying 
> (glibc.version >= 2.20 && JDK.version >= 8u20 && x86_64).


