[ https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705860#comment-13705860 ]
Hudson commented on YARN-569:
-----------------------------
Integrated in Hadoop-Mapreduce-trunk #1484 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1484/])
YARN-569. Add support for requesting and enforcing preemption requests via
a capacity monitor. Contributed by Carlo Curino, Chris Douglas (Revision
1502083)
Result = SUCCESS
cdouglas :
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1502083
Files :
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/Priority.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/SchedulingEditPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/SchedulingMonitor.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ContainerPreemptEvent.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ContainerPreemptEventType.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/PreemptableResourceScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
> CapacityScheduler: support for preemption (using a capacity monitor)
> --------------------------------------------------------------------
>
> Key: YARN-569
> URL: https://issues.apache.org/jira/browse/YARN-569
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacityscheduler
> Reporter: Carlo Curino
> Assignee: Carlo Curino
> Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf,
> preemption.2.patch, YARN-569.10.patch, YARN-569.11.patch, YARN-569.1.patch,
> YARN-569.2.patch, YARN-569.3.patch, YARN-569.4.patch, YARN-569.5.patch,
> YARN-569.6.patch, YARN-569.8.patch, YARN-569.9.patch, YARN-569.patch,
> YARN-569.patch
>
>
> There is a tension between the fast-paced, reactive role of the
> CapacityScheduler, which needs to respond quickly to application resource
> requests and node updates, and the more introspective, time-based
> considerations needed to observe and correct for capacity balance. To this
> purpose, rather than hacking the delicate mechanisms of the CapacityScheduler
> directly, we opted to add support for preemption by means of a "Capacity
> Monitor", which can optionally be run as a separate service (much like the
> NMLivelinessMonitor).
> The capacity monitor (similar to equivalent functionality in the fair
> scheduler) runs at regular intervals (e.g., every 3 seconds), observes the
> state of the assignment of resources to queues by the capacity scheduler,
> performs an off-line computation to determine whether preemption is needed
> and how best to "edit" the current schedule to improve capacity balance, and
> generates events that produce four possible actions:
> # Container de-reservations
> # Resource-based preemptions
> # Container-based preemptions
> # Container killing
> The actions listed above are progressively more costly, and it is up to the
> policy to use them as desired to achieve its rebalancing goals.
> Note that due to the "lag" in the effect of these actions, the policy should
> operate at a macroscopic level (e.g., preempt tens of containers from a
> queue) and not try to tightly and consistently micromanage container
> allocations.
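>
> To make the division of labor concrete, here is a minimal sketch of a monitor
> thread driving a pluggable edit policy. The names and signatures are
> illustrative only, not the exact SchedulingMonitor/SchedulingEditPolicy
> interfaces committed in this patch:
> {code:java}
> // Illustrative sketch: a monitor thread periodically invoking an edit policy.
> // Names and signatures are assumptions, not the interfaces from YARN-569.
> interface EditPolicy {
>   void editSchedule();            // observe queue state, emit preemption events
>   long getMonitoringIntervalMs(); // e.g., 3000 ms
> }
>
> class Monitor extends Thread {
>   private final EditPolicy policy;
>
>   Monitor(EditPolicy policy) { this.policy = policy; }
>
>   @Override
>   public void run() {
>     while (!isInterrupted()) {
>       try {
>         policy.editSchedule();    // off-line computation, then schedule "edits"
>         Thread.sleep(policy.getMonitoringIntervalMs());
>       } catch (InterruptedException e) {
>         interrupt();              // restore the flag and exit the loop cleanly
>       }
>     }
>   }
> }
> {code}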
> ------------- Preemption policy (ProportionalCapacityPreemptionPolicy):
> -------------
> Preemption policies are pluggable by design; in the following we present an
> initial policy (ProportionalCapacityPreemptionPolicy) that we have been
> experimenting with. The ProportionalCapacityPreemptionPolicy behaves as
> follows:
> # it gathers from the scheduler the state of the queues, in particular their
> current capacity, guaranteed capacity, and pending requests (*)
> # if there are pending requests from queues that are under capacity, it
> computes a new ideal balanced state (**)
> # it computes the set of preemptions needed to repair the current schedule
> and achieve capacity balance (accounting for natural completion rates, and
> respecting bounds on the amount of preemption we allow for each round)
> # it selects which applications to preempt from each over-capacity queue
> (the last one in FIFO order)
> # it removes reservations from the most recently assigned application until
> the amount of resources to reclaim is obtained, or until no more
> reservations exist
> # (if not enough) it issues preemptions for containers from the same
> application (in reverse chronological order, last assigned container first),
> again until the target is met or until no containers except the AM container
> are left
> # (if not enough) it moves on to unreserve and preempt from the next
> application
> # containers that have been asked to vacate are tracked across executions;
> if a container remains among those to be preempted for more than a certain
> time, it is moved to the list of containers to be forcibly killed (a sketch
> of this bookkeeping follows the list)
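>
> A hypothetical sketch of the bookkeeping in the last step: remember when each
> container was first asked to vacate, and escalate to a forced kill once it
> has been pending longer than the configured wait. The class and method names
> are invented for illustration:
> {code:java}
> import java.util.HashMap;
> import java.util.Map;
>
> // Hypothetical sketch, not code from the patch: track when each container was
> // first asked to vacate, and escalate to a kill after maxWaitBeforeKillMs.
> final class PreemptionTracker {
>   private final Map<String, Long> firstAskedMs = new HashMap<String, Long>();
>   private final long maxWaitBeforeKillMs;
>
>   PreemptionTracker(long maxWaitBeforeKillMs) {
>     this.maxWaitBeforeKillMs = maxWaitBeforeKillMs;
>   }
>
>   /** Record a preemption request; returns true if the container should now be killed. */
>   boolean recordPreemptionRequest(String containerId, long nowMs) {
>     if (!firstAskedMs.containsKey(containerId)) {
>       firstAskedMs.put(containerId, nowMs);
>       return false;
>     }
>     return nowMs - firstAskedMs.get(containerId) >= maxWaitBeforeKillMs;
>   }
>
>   /** A container no longer selected for preemption starts fresh next time. */
>   void forget(String containerId) {
>     firstAskedMs.remove(containerId);
>   }
> }
> {code}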
> Notes:
> (*) at the moment, in order to avoid double-counting of requests, we only
> look at the "ANY" part of pending resource requests, which means we might
> not preempt on behalf of AMs that ask only for specific locations and not
> for ANY.
> (**) The ideal balanced state is one in which each queue receives at least
> its guaranteed capacity, and the spare capacity is distributed among the
> queues that want more of it as a weighted fair share, where the weighting is
> based on the guaranteed capacity of each queue and the computation runs to a
> fixed point (see the sketch below).
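>
> To illustrate note (**), here is a self-contained sketch of that weighted
> fair-share fixed point. It is not the code from the patch; capacities are
> simplified to scalar doubles and all names are invented:
> {code:java}
> import java.util.Arrays;
>
> // Sketch of note (**): give each queue up to its guarantee (capped by its
> // demand), then share the spare among still-hungry queues in proportion to
> // their guarantees, iterating until nothing changes (a fixed point).
> final class IdealBalance {
>   static double[] compute(double total, double[] guaranteed, double[] demand) {
>     int n = guaranteed.length;
>     double[] ideal = new double[n];
>     boolean[] satisfied = new boolean[n];
>     double spare = total;
>     for (int i = 0; i < n; i++) {
>       ideal[i] = Math.min(guaranteed[i], demand[i]);
>       spare -= ideal[i];
>       satisfied[i] = ideal[i] >= demand[i];
>     }
>     while (spare > 1e-9) {
>       double weightSum = 0;
>       for (int i = 0; i < n; i++) {
>         if (!satisfied[i]) weightSum += guaranteed[i];
>       }
>       if (weightSum == 0) break;            // no queue wants more capacity
>       double handedOut = 0;
>       for (int i = 0; i < n; i++) {
>         if (satisfied[i]) continue;
>         double share = spare * guaranteed[i] / weightSum;
>         double take = Math.min(share, demand[i] - ideal[i]);
>         ideal[i] += take;
>         handedOut += take;
>         satisfied[i] = ideal[i] >= demand[i] - 1e-9;
>       }
>       if (handedOut < 1e-9) break;          // fixed point reached
>       spare -= handedOut;
>     }
>     return ideal;
>   }
>
>   public static void main(String[] args) {
>     // Guarantees 50/30/20 of a 100-unit cluster; demands 80/10/40. Queue 2's
>     // unused guarantee is split 50:20 between queues 1 and 3.
>     System.out.println(Arrays.toString(compute(100,
>         new double[] { 50, 30, 20 }, new double[] { 80, 10, 40 })));
>     // prints roughly [64.29, 10.0, 25.71]
>   }
> }
> {code}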
> Tunables of the ProportionalCapacityPreemptionPolicy (see the configuration
> sketch after this list):
> # observe-only mode (i.e., log the actions it would take, but behave as
> read-only)
> # how frequently to run the policy
> # how long to wait between preempting and killing a container
> # what fraction of the containers I would like to obtain I should actually
> preempt (this accounts for the natural rate at which containers are returned)
> # dead-zone size, i.e., what % of over-capacity to ignore (if we are off
> perfect balance by some small %, we ignore it)
> # the overall amount of preemption we can afford for each run of the policy
> (in terms of total cluster capacity)
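>
> As a hedged sketch of how these tunables map onto configuration, the
> following sets the properties added to yarn-default.xml; the key names and
> default values shown here should be verified against the committed revision:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class PreemptionConfigSketch {
>   // Key names follow the yarn-default.xml entries added by the patch; verify
>   // them against revision 1502083 before relying on them.
>   public static Configuration create() {
>     Configuration conf = new Configuration();
>     // enable the scheduling monitor and select the edit policy
>     conf.setBoolean("yarn.resourcemanager.scheduler.monitor.enable", true);
>     conf.set("yarn.resourcemanager.scheduler.monitor.policies",
>         "org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity."
>             + "ProportionalCapacityPreemptionPolicy");
>     // observe-only mode: log intended actions, change nothing
>     conf.setBoolean(
>         "yarn.resourcemanager.monitor.capacity.preemption.observe_only", false);
>     // how frequently to run the policy (ms)
>     conf.setLong(
>         "yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval",
>         3000L);
>     // how long to wait between preemption and kill (ms)
>     conf.setLong(
>         "yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill",
>         15000L);
>     // fraction of the computed target to actually preempt per round
>     conf.setFloat(
>         "yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor",
>         0.2f);
>     // dead zone: ignore over-capacity below this fraction
>     conf.setFloat(
>         "yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity",
>         0.1f);
>     // cap on preemption per run, as a fraction of total cluster capacity
>     conf.setFloat(
>         "yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round",
>         0.1f);
>     return conf;
>   }
> }
> {code}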
> In our current experiments this set of tunables seems to be a good starting
> point for shaping the preemption action properly. More sophisticated
> preemption policies could take into account the different types of
> applications running, job priorities, the cost of preemption, or the
> integral of the capacity imbalance over time. This is very much a
> control-theory kind of problem, and some of the lessons on designing and
> tuning controllers are likely to apply.
> Generality:
> The monitor-based schedule editing and the preemption mechanisms we
> introduce here are designed to be more general than enforcing
> capacity/fairness; in fact, we are considering other monitors that leverage
> the same idea of "schedule edits" to target different global properties
> (e.g., allocating enough resources to guarantee deadlines for important
> jobs, data-locality optimizations, IO-balancing among nodes, etc.).
> Note that the preemption policy we describe here is disabled by default in
> the patch.
> Depends on YARN-45 and YARN-567; related to YARN-568.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira