[
https://issues.apache.org/jira/browse/YARN-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16021942#comment-16021942
]
Daniel Templeton commented on YARN-3409:
----------------------------------------
Sorry for coming late to the conversation. Last week I had a quick chat with
[~Naganarasimha] offline about the plans, and I wanted to share an alternate
perspective.
If you go look at the way HPC job schedulers (like Grid Engine et al) handle
this requirement, it's an extension of resources. The work that [~vvasudev]
has done on resource types opens up a natural path to add "static" resource
types with the characteristics described here. The advantage is that the
plumbing for resources is already very mature, and extending it to support
static resources would not introduce much in the way of new logic. The
implementation of constraints then naturally becomes a superset of resource
matching for the consumable resources. The disadvantage that [~Naganarasimha]
pointed out is that users would have to understand that resources can be static
or consumable, which is a higher bar than just asserting that all resources are
consumable. Given that all the major HPC job schedulers have been using static
resources for this purpose successfully for decades, I don't see that being a
major issue.
To add a little more detail, here's what Grid Engine does (that's relevant
to us). (See
http://gridscheduler.sourceforge.net/htmlman/htmlman5/complex.html)
* All resources have a type, e.g. string, double, boolean, etc.
* All resources have an associated relational operator. For example, the memory
resource has >= as a relational operator, meaning that a request for 4GB of
memory is satisfied by any offer of >= 4GB of memory. In general, resources can
only be meaningfully compared in one direction.
* All resources are either consumable or static. Only numeric resources can be
consumable.
* Memory and CPU (and a couple others) are provided implicitly by the system.
* It's possible to configure the agents to run scripts periodically to
programmatically determine values for any resources. Consumable resources
decrement from that value.
* The scheduler uses the relational operator for all resources to determine
whether resource requests fit a destination queue/host.
Putting static resources and consumables in the same boat saves a fair bit of
logic duplication in implementing things like programmatically determined
values.
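To make the model above concrete, here is a minimal Python sketch of how static and consumable resources could share one matching path. It is only an illustration: the names (ResourceType, Node, fits, allocate, memory_gb, jdk_major, arch) are hypothetical and are not taken from Grid Engine or from the YARN resource types work.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, Union

Value = Union[int, float, str, bool]


@dataclass(frozen=True)
class ResourceType:
    """Hypothetical resource definition: a name, a relational operator, a kind."""
    name: str
    relop: Callable[[Value, Value], bool]  # evaluated as relop(offered, requested)
    consumable: bool = False               # False means "static"


class Node:
    """Hypothetical node that advertises one value per resource type."""

    def __init__(self, types: Dict[str, ResourceType], values: Dict[str, Value]):
        for name, value in values.items():
            numeric = isinstance(value, (int, float)) and not isinstance(value, bool)
            if types[name].consumable and not numeric:
                raise TypeError(f"consumable resource {name!r} must be numeric")
        self.types = types
        self.available = dict(values)

    def fits(self, request: Dict[str, Value]) -> bool:
        # One code path for every resource: apply its relational operator.
        for name, wanted in request.items():
            rtype = self.types.get(name)
            if rtype is None or name not in self.available:
                return False
            if not rtype.relop(self.available[name], wanted):
                return False
        return True

    def allocate(self, request: Dict[str, Value]) -> bool:
        if not self.fits(request):
            return False
        # Only consumable resources are decremented; static ones never change.
        for name, wanted in request.items():
            if self.types[name].consumable:
                self.available[name] -= wanted
        return True


TYPES = {
    "memory_gb": ResourceType("memory_gb", operator.ge, consumable=True),
    "jdk_major": ResourceType("jdk_major", operator.ge),  # static, ordered
    "arch": ResourceType("arch", operator.eq),            # static, exact match
}

node = Node(TYPES, {"memory_gb": 8, "jdk_major": 8, "arch": "x86_64"})
request = {"memory_gb": 4, "jdk_major": 8, "arch": "x86_64"}
```

In this sketch a constraint such as "JDK version >= 8" is just a request against a static resource, so the same fits() check serves both constraints and consumable resources, and only allocate() needs to know the difference.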
> Add constraint node labels
> --------------------------
>
> Key: YARN-3409
> URL: https://issues.apache.org/jira/browse/YARN-3409
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: api, capacityscheduler, client
> Reporter: Wangda Tan
> Assignee: Naganarasimha G R
> Attachments: Constraint-Node-Labels-Requirements-Design-doc_v1.pdf,
> YARN-3409.WIP.001.patch
>
>
> Specifying only one label for each node (IAW, partitioning a cluster) is a way
> to determine how the resources of a special set of nodes can be shared by a
> group of entities (like teams, departments, etc.). Partitions of a cluster
> have the following characteristics:
> - The cluster is divided into several disjoint sub-clusters.
> - ACLs/priorities can apply to a partition (e.g., only the market team can use
> the partition, or the market team has priority to use it).
> - Percentages of capacity can apply to a partition (the market team has 40%
> minimum capacity and the dev team has 60% minimum capacity of the partition).
> Constraints are orthogonal to partitions; they describe attributes of a
> node’s hardware/software, just for affinity. Some examples of constraints:
> - glibc version
> - JDK version
> - Type of CPU (x86_64/i686)
> - Type of OS (windows, linux, etc.)
> With this, an application can ask for resources that have (glibc.version >=
> 2.20 && JDK.version >= 8u20 && x86_64).