[ 
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751851#comment-15751851
 ] 

Naganarasimha G R commented on YARN-3409:
-----------------------------------------


Thanks [~wangda], [~kkaranasos] & [~grey], for sharing your thoughts.
I agree with the general idea of going top down (starting from the
requirements), but let's not again limit ourselves to the current problem,
bring up the API, and later have minimal room for future requirements (due to
compatibility issues). Based on past experience, it will take a long time to
come back and update the interface once we define it. Besides, we do have some
scenarios where numerical comparisons, at the minimum, are required.

I kind of agree with some of @wangda's points, for example that exposing
{{public constraint-type}} would add the overhead of maintaining it as a
compatible interface in future. But the point I want to make is that we should
at least have predetermined types (as wangda mentioned earlier: String, Long,
Version (long array followed by a string)). I would still prefer to have a
constraint type (defaulting to string), so that the request doesn't hang
because a wrong expression was provided, or because a value was wrongly
inferred as a particular type based on a regex. If the types are predefined,
it is easier to report in the response, or to throw an exception, that the
given expression is invalid. It would definitely save processing time too when
they do not match.
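
As a minimal sketch of the predefined-type idea (all names here are
hypothetical and are not taken from the YARN API or the YARN-3409 patch), a
constraint could carry a declared type that defaults to string and rejects a
mismatching value up front:

```java
import java.util.Locale;

// Hypothetical illustration only; not part of YARN or the YARN-3409 patch.
enum ConstraintType {
    STRING, LONG, VERSION;

    /** A missing type falls back to STRING ("defaulting to string"). */
    static ConstraintType fromDeclared(String declared) {
        if (declared == null || declared.trim().isEmpty()) {
            return STRING;
        }
        // valueOf throws IllegalArgumentException for an unknown type name.
        return valueOf(declared.trim().toUpperCase(Locale.ROOT));
    }

    /** Fails fast on a value that does not fit the declared type. */
    void validate(String value) {
        if (value == null) {
            throw new IllegalArgumentException("Constraint value is null");
        }
        switch (this) {
            case LONG:
                try {
                    Long.parseLong(value);
                } catch (NumberFormatException e) {
                    throw new IllegalArgumentException("Not a LONG: " + value, e);
                }
                break;
            case VERSION:
                // Assumed format: dot-separated numbers, optional string suffix.
                if (!value.matches("\\d+(\\.\\d+)*([-_.]?[A-Za-z][A-Za-z0-9]*)?")) {
                    throw new IllegalArgumentException("Not a VERSION: " + value);
                }
                break;
            default:
                break; // STRING accepts any non-null value
        }
    }
}
```

With something like this, {{ConstraintType.LONG.validate("sixteen")}} would
fail immediately with an exception that can be returned to the client, rather
than the request waiting on an expression that can never match.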
For the other complexities mentioned:
{quote} 
* Change type of constraint
* Remove/Add constraint (Remove constraint need to check if it is associate 
with other node, etc.)
{quote} 
This would be the same as modifying the existing partition type (exclusive to
non-exclusive and vice versa). I feel it's not too complex and, as you
mentioned, if the number of references to this constraint becomes zero then we
can allow it to be removed. A similar thing is already supported for
partitions currently.

Going one by one through the examples would be a long, debatable discussion,
but at the least, one straightforward scenario for numerical comparison which
we have is the number of disks. We have CarbonData (an Apache incubator
project), which is more of an OLAP processing system. We usually deploy it with
other big data projects, so it is more like heterogeneous workloads on
heterogeneous hardware. When we do a bulk load of data (TBs) into CarbonData
(using Spark), we tend to have a lot of intermediate data. To parallelize the
processing we require not only more cores but a good number of disks too, and
based on the partitions we can generally determine how many disks would be
good to achieve better load performance. If we bring in the number of disks as
a resource type, it will be enforced on all the containers running in the
cluster and it will limit the cluster usability. So it would be ideal, at
least for these kinds of tasks, to be able to request nodes with at least the
specified number of disks to improve the load performance, with additional
constraints like anti-affinity to aid it.
I can explain (or rather debate :) ) other scenarios too, maybe in the offline
meeting.
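
To make the disk-count scenario concrete, here is a small sketch of the kind
of numerical match involved. The "disks" constraint, the class and the method
names are assumptions for illustration only; nothing here is an existing YARN
API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of YARN or the YARN-3409 patch.
final class DiskCountFilter {

    /** Returns the hosts whose numeric "disks" constraint is >= minDisks. */
    static List<String> nodesWithAtLeast(Map<String, Long> disksByHost, long minDisks) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Long> entry : disksByHost.entrySet()) {
            Long disks = entry.getValue();
            // Nodes without a value for the constraint never match.
            if (disks != null && disks >= minDisks) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Long> disksByHost = new LinkedHashMap<>();
        disksByHost.put("node1", 4L);
        disksByHost.put("node2", 12L);
        disksByHost.put("node3", 8L);
        // Only the bulk-load containers ask for this; other containers are unaffected.
        System.out.println(nodesWithAtLeast(disksByHost, 8));
    }
}
```

Because the disk count is only a node attribute here and not a resource type,
it is not accounted against every container in the cluster; it only narrows
the candidate nodes for the requests that ask for it.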

bq. For example, using simple labels, one node can be annotated with label 
"Java6". Then a task that requires at least Java 5 can request for a node with 
"Java5 || Java6". I think that with our current use cases, this will be 
sufficient.
These kinds of version comparisons will have an impact when the admin upgrades
Java to a particular higher version. Apps which are already running will then
be impacted, and the admin will not be aware of the apps which depend on the
Java version. So a good thing would be to have expressions of the kind
version > x.y.z.
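
A rough sketch of such a comparison, treating a version as the "long array
followed by a string" mentioned earlier and comparing only the numeric parts.
The class and method names are hypothetical, and ignoring the trailing string
is a simplifying assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of YARN or the YARN-3409 patch.
final class VersionCompare {

    /** Parses the leading dot-separated numbers, e.g. "1.8.0_20" gives [1, 8, 0]. */
    static long[] numericParts(String version) {
        List<Long> parts = new ArrayList<>();
        int i = 0;
        while (i < version.length() && Character.isDigit(version.charAt(i))) {
            int start = i;
            while (i < version.length() && Character.isDigit(version.charAt(i))) {
                i++;
            }
            parts.add(Long.parseLong(version.substring(start, i)));
            if (i < version.length() && version.charAt(i) == '.') {
                i++; // continue with the next numeric part
            } else {
                break; // reached the end or the trailing string part
            }
        }
        long[] out = new long[parts.size()];
        for (int k = 0; k < out.length; k++) {
            out[k] = parts.get(k);
        }
        return out;
    }

    /** Negative, zero or positive, like Comparator; missing parts count as 0. */
    static int compare(String a, String b) {
        long[] x = numericParts(a);
        long[] y = numericParts(b);
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            long xi = i < x.length ? x[i] : 0;
            long yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Long.compare(xi, yi);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // A node upgraded from 1.6.x to 1.7.x still satisfies "version >= 1.5".
        System.out.println(compare("1.7.0_80", "1.5") >= 0);
    }
}
```

A request expressed as version >= 1.5 keeps matching after the node is
upgraded, whereas an enumerated "Java5 || Java6" expression stops matching as
soon as the node's label changes to a newer version.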

bq. Could you add some references here?
Well, maybe I didn't state what I intended, or it was wrongly taken as support
in other schedulers. Different schedulers like IBM LSF, Mesos and Kubernetes
support labels in different ways for different scenarios. Some support
expressions on load average/swap, some support different kinds of expressions
(SET, RANGE), and some support both string based constraints. So I tried to
generalize it as extendable customized labels, and did not intend to say that
it is already supported.

LSF : http://www.ibm.com/support/knowledgecenter/SSETD4_9.1.3/lsf_admin/select_string.html
MESOS : http://mesos.apache.org/documentation/latest/attributes-resources/
KUBERNETES : http://kubernetes.io/docs/user-guide/node-selection/

Regarding considering anti-affinity and affinity within the scope of
constraints, I had a few queries/concerns:
# The current syntax you mentioned captures anti-affinity and affinity with
apps, but how about an app's priority (different types of tasks)? It looks
like a very specific syntax needs to be introduced for that too?
# According to YARN-1042 you had hinted at the target being
host/rack/partition. If we start supporting this within constraints, I doubt
we will later be able to consider these kinds of requirements, or we will have
to start adding specific notations to the constraint expression. To have
better flexibility I would opt for it as a separate entity rather than club it
in here.
# Taking a look at the 1042 patch, I understand that you want to make use of
the name of the application and the numerical priority value itself as the
constraint names. This approach looks good, but I had a few
queries/apprehensions.
* Consider the HBase case wherein we assume that the master is named
Hbase_Master and the regionserver priority is, say, 5. Now while bulk loading
I would require the reducer to run in the location specific to a particular
regionserver rather than any regionserver. With this approach that is not
possible, or it forces HBase to use different priorities for different
regionservers, which causes priority inversions etc. (It's just an example;
there are alternatives to achieve it, but there can be other similar
scenarios.)
* IIUC we do not ensure app names are unique in the cluster. Assume we want to
have 2 HBase setups in the same cluster; then there should be a way to specify
affinity and anti-affinity to an individual HBase setup.

Initially, when [~kkaranasos] mentioned affinity and anti-affinity, I
visualized it like this: the app can define constraints on the fly which are
valid for the scope of the application (and/or container, for different
priorities), and the RM labels the nodes with the app-specified constraint
name and value when the node runs the specified app's AM/container. In this
way we will be able to specify affinity/anti-affinity for multiple instances
of the same app and its different containers. It's not a completely
thought-out approach, just my two cents.



> Add constraint node labels
> --------------------------
>
>                 Key: YARN-3409
>                 URL: https://issues.apache.org/jira/browse/YARN-3409
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, capacityscheduler, client
>            Reporter: Wangda Tan
>            Assignee: Naganarasimha G R
>         Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf, 
> YARN-3409.WIP.001.patch
>
>
> Specify only one label for each node (IAW, partition a cluster) is a way to 
> determinate how resources of a special set of nodes could be shared by a 
> group of entities (like teams, departments, etc.). Partitions of a cluster 
> has following characteristics:
> - Cluster divided to several disjoint sub clusters.
> - ACL/priority can apply on partition (Only market team / marke team has 
> priority to use the partition).
> - Percentage of capacities can apply on partition (Market team has 40% 
> minimum capacity and Dev team has 60% of minimum capacity of the partition).
> Constraints are orthogonal to partition, they’re describing attributes of 
> node’s hardware/software just for affinity. Some example of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, application can be able to ask for resource has (glibc.version >= 
> 2.20 && JDK.version >= 8u20 && x86_64).


