[ https://issues.apache.org/jira/browse/YARN-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067503#comment-15067503 ]
Carlo Curino commented on YARN-4195:
------------------------------------

[~leftnoteasy], good questions:
# We can honor the request in best-effort mode. We can't guarantee we will find nodes that match, but if one is found (and not promised to others) we can give it to the user.
# The assignment is independent of capacity promises, so it goes through normally (am I missing something in this question?). The reservation-system has mechanisms to react to changes in the amount of capacity under each partition/label; handling the case of a label that changes dynamically _after_ a job has grabbed the container is out-of-scope.

The combinatorial explosion is even worse: I think it is the powerset of labels, so {{|partitions| = 2^N}}, _where N is the number of base labels_. The good news is that we do not automatically explode all combinations, but only focus on *"active" partitions*, i.e., unique combinations of labels that are associated with at least one node. This means that we have a hard upper-bound {{|partitions| <= K}}, where _K is the number of nodes in the cluster_. We should run some tests, but it is possible that YARN-4476's algos are efficient enough to deal with this even for large clusters (e.g., {{K=5000}}).

{{|partitions| = K}} would be the norm if we decide to unify the notion of labels with the one of locality (i.e., a machine name is nothing but a label, and so is the rack).

If, however, we do not converge node-labels and locality, I would expect that in most clusters we could find groups of nodes which are fungible, i.e., they form an equivalence-class (w.r.t. the set of labels they share) of size >1. This is like saying that those nodes are "indistinguishable" for the user (this used to be true for all nodes in a cluster, bar locality). In some of our data centers we saw unique partitions in the order of {{K/100 < |partitions| < K/20}}, for the set of labels we cared about in those settings. (Mileage will vary heavily based on the semantics/use of labels you consider.)
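
The counting argument above can be illustrated with a minimal sketch. This is plain Python, not YARN code: the node names, the labels, and the helper {{active_partitions}} are all hypothetical, chosen only to show that grouping nodes by their exact label set yields at most {{min(2^N, K)}} active partitions.

```python
from collections import Counter


def active_partitions(node_labels):
    """Group nodes by their exact label set.

    Each distinct label set that appears on at least one node is one
    "active" partition; the count is the size of its equivalence class.
    (Hypothetical helper for illustration, not part of YARN.)
    """
    return Counter(frozenset(labels) for labels in node_labels.values())


# Made-up cluster: 6 nodes, 3 base labels.
node_labels = {
    "node1": {"gpu", "ssd"},
    "node2": {"gpu", "ssd"},
    "node3": {"ssd"},
    "node4": set(),            # unlabeled node (default partition)
    "node5": {"gpu", "fpga"},
    "node6": {"ssd"},
}

partitions = active_partitions(node_labels)
base_labels = set().union(*node_labels.values())

n = len(base_labels)        # N: number of base labels
k = len(node_labels)        # K: number of nodes in the cluster
powerset_size = 2 ** n      # worst case if every combination were materialized
active = len(partitions)    # combinations actually present on some node

# The number of active partitions can exceed neither bound.
assert active <= min(powerset_size, k)

for label_set, size in partitions.items():
    print(sorted(label_set), "->", size, "node(s)")
```

In this toy cluster the two nodes labeled {{gpu,ssd}} and the two labeled {{ssd}} are fungible, so the number of active partitions is smaller than the number of nodes. If every node also carried its own hostname as a label, each label set would be unique and the count would equal {{K}}.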

> Support of node-labels in the ReservationSystem "Plan"
> ------------------------------------------------------
>
>                 Key: YARN-4195
>                 URL: https://issues.apache.org/jira/browse/YARN-4195
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>         Attachments: YARN-4195.patch
>
>
> As part of YARN-4193 we need to enhance the InMemoryPlan (and related
> classes) to track the per-label available resources, as well as the per-label
> reservation-allocations.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)