[
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15751851#comment-15751851
]
Naganarasimha G R commented on YARN-3409:
-----------------------------------------
Thanks [~wangda], [~kkaranasos] & [~grey], for sharing your thoughts.
I agree with the general idea of working top-down (starting from the requirements), but let's not limit ourselves to the current problem, ship the API, and then have minimal room for future requirements because of compatibility issues. Based on past experience, it takes a long time to come back and update an interface once it is defined. Besides, we do have scenarios where, at a minimum, numerical comparisons are required.
I partly agree with some of [~wangda]'s points, e.g. that exposing a {{public
constraint-type}} would add the overhead of maintaining it as a compatible
interface in the future. But the point I want to make is that we should at
least have predetermined types (as wangda mentioned earlier: String, Long,
Version (a long array followed by a string)). I would still prefer an explicit
constraint type (defaulting to String), so that a request does not hang because
a wrong expression was provided, or get wrongly classified as a particular type
based on a regex. With predefined types it is easier to report in the response,
or throw an exception, that the given expression is invalid; it would also save
processing time when the types do not match.
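To make the "fail fast on invalid expressions" idea concrete, here is a minimal sketch of predetermined constraint value types with eager validation. All names here ({{ConstraintValue}}, {{ConstraintType}}) are hypothetical, not existing YARN APIs; the Version format follows the "long array followed by a string" shape mentioned above.

```java
// Hypothetical sketch: predetermined constraint value types with eager
// validation, so an invalid expression is rejected with an exception up
// front instead of silently never matching (the request "hanging").
public class ConstraintValue {
    public enum ConstraintType { STRING, LONG, VERSION }

    public final ConstraintType type;
    public final String raw;

    private ConstraintValue(ConstraintType type, String raw) {
        this.type = type;
        this.raw = raw;
    }

    // Parse a value against its declared type; throws on mismatch.
    public static ConstraintValue parse(ConstraintType type, String raw) {
        switch (type) {
            case LONG:
                Long.parseLong(raw); // throws NumberFormatException if invalid
                break;
            case VERSION:
                // "long array followed by a string", e.g. 2.20, 1.8.0_121, 8u20
                if (!raw.matches("\\d+(\\.\\d+)*([_-]?\\w+)?")) {
                    throw new IllegalArgumentException("Invalid version: " + raw);
                }
                break;
            case STRING:
            default:
                break; // any string is acceptable (the default type)
        }
        return new ConstraintValue(type, raw);
    }
}
```

With this, a wrongly typed value (e.g. a LONG constraint given "abc") surfaces as an immediate exception that can be reported back in the response.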
Regarding the other complexities mentioned:
{quote}
* Change type of constraint
* Remove/Add constraint (Remove constraint need to check if it is associate
with other node, etc.)
{quote}
This would be the same as modifying an existing partition's type (exclusive to
non-exclusive and vice versa), and I feel it is not too complex. As you
mentioned, once the number of references to a constraint drops to zero we can
allow removing it. Something similar is already supported for partitions.
Going one by one over the examples would be a long debate, but at least one
straightforward scenario we have for numerical comparison is the number of
disks. We work with CarbonData (an incubating Apache project), which is geared
towards OLAP processing. We usually deploy it alongside other big-data
projects, so we end up with heterogeneous workloads on heterogeneous hardware.
When we bulk-load data (TBs) into CarbonData using Spark, we produce a lot of
intermediate data. To parallelize the processing we need not only more cores
but also a good number of disks; based on the partitions we can generally
determine how many disks are needed for good load performance. If we model the
number of disks as a resource type, it will be enforced on all containers
running in the cluster and limit cluster usability. So it would be ideal, at
least for these kinds of tasks, to be able to request nodes with at least a
specified number of disks, combined with additional constraints like
anti-affinity to aid it.
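As a sketch of the kind of numeric matching this scenario needs (e.g. "give me a node where disks >= 8"), the check on the scheduler side could look like the following. The class and operator set are illustrative assumptions, not part of any existing patch.

```java
// Hypothetical sketch of numeric constraint matching against a node's
// label value, e.g. request "disks >= 8" against a node with disks=12.
public class NumericConstraint {
    public static boolean matches(long nodeValue, String op, long requested) {
        switch (op) {
            case ">=": return nodeValue >= requested;
            case "<=": return nodeValue <= requested;
            case ">":  return nodeValue > requested;
            case "<":  return nodeValue < requested;
            case "==": return nodeValue == requested;
            default:
                throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }
}
```

A node with 12 disks would satisfy "disks >= 8", while a node with 4 disks would be skipped, without having to turn disks into a cluster-wide enforced resource type.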
I can explain (or rather debate :)) other scenarios too, perhaps in the offline
meeting.
bq. For example, using simple labels, one node can be annotated with label
"Java6". Then a task that requires at least Java 5 can request for a node with
"Java5 || Java6". I think that with our current use cases, this will be
sufficient.
These kinds of version comparisons break down when the admin upgrades Java to
some higher version: apps that are already running will be impacted, and the
admin has no way of knowing which apps depend on a particular Java version. So
it would be better to support expressions of the form version > x.y.z.
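A version > x.y.z expression needs numeric, component-wise comparison rather than string comparison (lexicographically "2.3" > "2.20", which is wrong). A minimal sketch, with an illustrative class name:

```java
// Hypothetical sketch: compare dotted version labels numerically, so that a
// request like "java.version >= 1.7" keeps matching after the admin upgrades
// the node to 1.8, with no need to enumerate "Java5 || Java6 || ...".
public class VersionCompare {
    // Returns negative/zero/positive, Comparator-style.
    public static int compare(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            long va = i < pa.length ? Long.parseLong(pa[i]) : 0;
            long vb = i < pb.length ? Long.parseLong(pb[i]) : 0;
            if (va != vb) {
                return Long.compare(va, vb);
            }
        }
        return 0; // missing components count as 0, so 1.7 == 1.7.0
    }
}
```

Note that compare("2.20", "2.3") is positive here, whereas a plain string comparison would get it backwards; this is exactly why Version deserves its own predetermined type.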
bq. Could you add some references here?
Well, maybe I did not state what I intended, or it was wrongly taken as a claim
of existing scheduler support. Different schedulers (IBM LSF, Mesos,
Kubernetes) support labels in different ways for different scenarios: some
support expressions on load average/swap, some support different kinds of
expressions (SET, RANGE), and some support string-based constraints. So I tried
to generalize this as extendable, customized labels; I did not intend to say
that it is already supported elsewhere.
LSF:
http://www.ibm.com/support/knowledgecenter/SSETD4_9.1.3/lsf_admin/select_string.html
MESOS: http://mesos.apache.org/documentation/latest/attributes-resources/
KUBERNETES: http://kubernetes.io/docs/user-guide/node-selection/
Regarding considering anti-affinity and affinity within the scope of
constraints, I had a few queries/concerns:
# The syntax you mentioned captures anti-affinity and affinity with apps, but
what about an app's priority (i.e. different types of tasks)? It looks like
very specific syntax would need to be introduced for that too.
# According to YARN-1042 you had hinted at host/rack/partition as targets. If
we start supporting these within constraints, I doubt we will be able to
accommodate such requirements later, or we will end up baking target-specific
notations into the constraint expression. For better flexibility I would opt
for this as a separate entity rather than club it in here.
# Taking a look at the YARN-1042 patch, I understand that you want to use the
application name and the numerical priority value themselves as the constraint
names. This approach looks good, but I had a few queries/apprehensions:
* Consider the HBase case, where we assume the master is named Hbase_Master and
the region server priority is, say, 5. While bulk-loading, I would want the
reducer to run co-located with a particular region server rather than just any
region server; with this approach that is not possible, unless we force HBase
to use different priorities for different region servers, which would cause
priority inversions, etc. (It is just an example and there are alternatives to
achieve it, but there can be other similar scenarios.)
* IIUC we do not ensure that app names are unique in the cluster. Suppose we
want to run two HBase setups in the same cluster; there should be a way to
specify affinity and anti-affinity to an individual HBase setup.
Initially, when [~kkaranasos] mentioned affinity and anti-affinity, I
visualized it like this: an app can define constraints on the fly that are
valid for the scope of the application (and/or per container for different
priorities), and the RM labels a node with the app-specified constraint name
and value when that node runs the app's AM/containers. This way we would be
able to specify affinity/anti-affinity for multiple instances of the same app
and its different containers. It is not a completely thought-through approach,
just my two cents.
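The app-scoped idea above could be sketched roughly as follows: the RM tags a node with an app-supplied (name, value) pair when that app's container starts there, and later requests express affinity or anti-affinity against the tag, keyed by application id so two HBase setups stay distinguishable. Everything here (class, tag format) is an illustrative assumption, not a proposed API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of app-scoped dynamic constraints: tags are keyed by
// appId so multiple instances of the same app (e.g. two HBase setups) can be
// targeted individually, and are valid only for the lifetime of the app.
public class AppScopedTags {
    // node -> set of "appId/name=value" tags
    private final Map<String, Set<String>> tagsByNode = new HashMap<>();

    // RM records the app-specified tag when the app's container starts on a node.
    public void onContainerStart(String node, String appId, String name, String value) {
        tagsByNode.computeIfAbsent(node, k -> new HashSet<>())
                  .add(appId + "/" + name + "=" + value);
    }

    // Affinity: the node must carry the tag; anti-affinity: it must not.
    public boolean satisfies(String node, String tag, boolean affinity) {
        boolean present =
            tagsByNode.getOrDefault(node, Collections.emptySet()).contains(tag);
        return affinity == present;
    }
}
```

With tags like "app1/regionserver=rs3", a reducer could express affinity to one particular region server of one particular HBase instance, which plain app-name/priority constraints cannot distinguish.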
> Add constraint node labels
> --------------------------
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, capacityscheduler, client
> Reporter: Wangda Tan
> Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf,
> YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (in other words, partitioning a
> cluster) is a way to determine how resources of a special set of nodes can be
> shared by a group of entities (like teams, departments, etc.). Partitions of
> a cluster have the following characteristics:
> - Cluster divided into several disjoint sub-clusters.
> - ACL/priority can apply to a partition (only the market team has priority to
> use the partition).
> - Percentage of capacities can apply to a partition (Market team has 40%
> minimum capacity and Dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a
> node's hardware/software just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources satisfying (glibc.version >=
> 2.20 && JDK.version >= 8u20 && x86_64).