[ https://issues.apache.org/jira/browse/YARN-569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13705860#comment-13705860 ]
Hudson commented on YARN-569:
-----------------------------
Integrated in Hadoop-Mapreduce-trunk #1484 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1484/])
YARN-569. Add support for requesting and enforcing preemption requests via
a capacity monitor. Contributed by Carlo Curino, Chris Douglas (Revision
1502083)
Result = SUCCESS
cdouglas :
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1502083
Files :
* /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/records/Priority.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/SchedulingEditPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/SchedulingMonitor.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/ProportionalCapacityPreemptionPolicy.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/AppSchedulingInfo.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ContainerPreemptEvent.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/ContainerPreemptEventType.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/PreemptableResourceScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerNode.java
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity
* /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/monitor/capacity/TestProportionalCapacityPreemptionPolicy.java
> CapacityScheduler: support for preemption (using a capacity monitor)
> --------------------------------------------------------------------
>
> Key: YARN-569
> URL: https://issues.apache.org/jira/browse/YARN-569
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacityscheduler
> Reporter: Carlo Curino
> Assignee: Carlo Curino
> Attachments: 3queues.pdf, CapScheduler_with_preemption.pdf,
> preemption.2.patch, YARN-569.10.patch, YARN-569.11.patch, YARN-569.1.patch,
> YARN-569.2.patch, YARN-569.3.patch, YARN-569.4.patch, YARN-569.5.patch,
> YARN-569.6.patch, YARN-569.8.patch, YARN-569.9.patch, YARN-569.patch,
> YARN-569.patch
>
>
> There is a tension between the fast-paced, reactive role of the
> CapacityScheduler, which needs to respond quickly to application resource
> requests and node updates, and the more introspective, time-based
> considerations needed to observe and correct for capacity balance. To this
> purpose, rather than hacking the delicate mechanisms of the CapacityScheduler
> directly, we opted to add support for preemption by means of a "Capacity
> Monitor", which can optionally be run as a separate service (much like the
> NMLivelinessMonitor).
> The capacity monitor (similar to equivalent functionality in the fair
> scheduler) runs at regular intervals (e.g., every 3 seconds), observes the
> state of the assignment of resources to queues by the capacity scheduler,
> performs an off-line computation to determine whether preemption is needed
> and how best to "edit" the current schedule to improve capacity balance, and
> generates events that produce four possible actions:
> # Container de-reservations
> # Resource-based preemptions
> # Container-based preemptions
> # Container killing
> The actions listed above are progressively more costly, and it is up to the
> policy to use them as desired to achieve its rebalancing goals.
> Note that due to the "lag" in the effect of these actions, the policy should
> operate at a macroscopic level (e.g., preempt tens of containers from a
> queue) and not try to tightly and consistently micromanage container
> allocations.
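>
> To make the division of labor concrete, here is a minimal sketch of a monitor
> thread driving a pluggable edit policy. The names and signatures are
> illustrative only, not the exact SchedulingMonitor/SchedulingEditPolicy
> interfaces committed in this patch:
> {code:java}
> // Illustrative sketch: a monitor thread periodically invoking an edit policy.
> // Names and signatures are assumptions, not the interfaces from YARN-569.
> interface EditPolicy {
>   void editSchedule();            // observe queue state, emit preemption events
>   long getMonitoringIntervalMs(); // e.g., 3000 ms
> }
>
> class Monitor extends Thread {
>   private final EditPolicy policy;
>
>   Monitor(EditPolicy policy) { this.policy = policy; }
>
>   @Override
>   public void run() {
>     while (!isInterrupted()) {
>       try {
>         policy.editSchedule();    // off-line computation, then schedule "edits"
>         Thread.sleep(policy.getMonitoringIntervalMs());
>       } catch (InterruptedException e) {
>         interrupt();              // restore the flag and exit the loop cleanly
>       }
>     }
>   }
> }
> {code}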
> ------------- Preemption policy (ProportionalCapacityPreemptionPolicy):
> -------------
> Preemption policies are pluggable by design; in the following we present an
> initial policy (ProportionalCapacityPreemptionPolicy) that we have been
> experimenting with. The ProportionalCapacityPreemptionPolicy behaves as
> follows:
> # it gathers from the scheduler the state of the queues, in particular their
> current capacity, guaranteed capacity, and pending requests (*)
> # if there are pending requests from queues that are under capacity, it
> computes a new ideal balanced state (**)
> # it computes the set of preemptions needed to repair the current schedule
> and achieve capacity balance (accounting for natural completion rates, and
> respecting bounds on the amount of preemption we allow for each round)
> # it selects which applications to preempt from each over-capacity queue
> (the last one in FIFO order)
> # it removes reservations from the most recently assigned application until
> the amount of resources to reclaim is obtained, or until no more
> reservations exist
> # (if not enough) it issues preemptions for containers from the same
> application (in reverse chronological order, last assigned container first),
> again until the target is met or until no containers except the AM container
> are left
> # (if not enough) it moves on to unreserve and preempt from the next
> application
> # containers that have been asked to vacate are tracked across executions;
> if a container remains among those to be preempted for more than a certain
> time, it is moved to the list of containers to be forcibly killed (a sketch
> of this bookkeeping follows the list)
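>
> A hypothetical sketch of the bookkeeping in the last step: remember when each
> container was first asked to vacate, and escalate to a forced kill once it
> has been pending longer than the configured wait. The class and method names
> are invented for illustration:
> {code:java}
> import java.util.HashMap;
> import java.util.Map;
>
> // Hypothetical sketch, not code from the patch: track when each container was
> // first asked to vacate, and escalate to a kill after maxWaitBeforeKillMs.
> final class PreemptionTracker {
>   private final Map<String, Long> firstAskedMs = new HashMap<String, Long>();
>   private final long maxWaitBeforeKillMs;
>
>   PreemptionTracker(long maxWaitBeforeKillMs) {
>     this.maxWaitBeforeKillMs = maxWaitBeforeKillMs;
>   }
>
>   /** Record a preemption request; returns true if the container should now be killed. */
>   boolean recordPreemptionRequest(String containerId, long nowMs) {
>     if (!firstAskedMs.containsKey(containerId)) {
>       firstAskedMs.put(containerId, nowMs);
>       return false;
>     }
>     return nowMs - firstAskedMs.get(containerId) >= maxWaitBeforeKillMs;
>   }
>
>   /** A container no longer selected for preemption starts fresh next time. */
>   void forget(String containerId) {
>     firstAskedMs.remove(containerId);
>   }
> }
> {code}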
> Notes:
> (*) at the moment, in order to avoid double-counting of requests, we only
> look at the "ANY" part of pending resource requests, which means we might
> not preempt on behalf of AMs that ask only for specific locations and not
> for ANY.
> (**) The ideal balanced state is one in which each queue receives at least
> its guaranteed capacity, and the spare capacity is distributed among the
> queues that want more of it as a weighted fair share, where the weighting is
> based on the guaranteed capacity of each queue and the computation runs to a
> fixed point (see the sketch below).
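>
> To illustrate note (**), here is a self-contained sketch of that weighted
> fair-share fixed point. It is not the code from the patch; capacities are
> simplified to scalar doubles and all names are invented:
> {code:java}
> import java.util.Arrays;
>
> // Sketch of note (**): give each queue up to its guarantee (capped by its
> // demand), then share the spare among still-hungry queues in proportion to
> // their guarantees, iterating until nothing changes (a fixed point).
> final class IdealBalance {
>   static double[] compute(double total, double[] guaranteed, double[] demand) {
>     int n = guaranteed.length;
>     double[] ideal = new double[n];
>     boolean[] satisfied = new boolean[n];
>     double spare = total;
>     for (int i = 0; i < n; i++) {
>       ideal[i] = Math.min(guaranteed[i], demand[i]);
>       spare -= ideal[i];
>       satisfied[i] = ideal[i] >= demand[i];
>     }
>     while (spare > 1e-9) {
>       double weightSum = 0;
>       for (int i = 0; i < n; i++) {
>         if (!satisfied[i]) weightSum += guaranteed[i];
>       }
>       if (weightSum == 0) break;            // no queue wants more capacity
>       double handedOut = 0;
>       for (int i = 0; i < n; i++) {
>         if (satisfied[i]) continue;
>         double share = spare * guaranteed[i] / weightSum;
>         double take = Math.min(share, demand[i] - ideal[i]);
>         ideal[i] += take;
>         handedOut += take;
>         satisfied[i] = ideal[i] >= demand[i] - 1e-9;
>       }
>       if (handedOut < 1e-9) break;          // fixed point reached
>       spare -= handedOut;
>     }
>     return ideal;
>   }
>
>   public static void main(String[] args) {
>     // Guarantees 50/30/20 of a 100-unit cluster; demands 80/10/40. Queue 2's
>     // unused guarantee is split 50:20 between queues 1 and 3.
>     System.out.println(Arrays.toString(compute(100,
>         new double[] { 50, 30, 20 }, new double[] { 80, 10, 40 })));
>     // prints roughly [64.29, 10.0, 25.71]
>   }
> }
> {code}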
> Tunables of the ProportionalCapacityPreemptionPolicy (see the configuration
> sketch after this list):
> # observe-only mode (i.e., log the actions it would take, but behave as
> read-only)
> # how frequently to run the policy
> # how long to wait between preempting and killing a container
> # what fraction of the containers I would like to obtain I should actually
> preempt (this accounts for the natural rate at which containers are returned)
> # dead-zone size, i.e., what % of over-capacity to ignore (if we are off
> perfect balance by some small %, we ignore it)
> # the overall amount of preemption we can afford for each run of the policy
> (in terms of total cluster capacity)
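>
> As a hedged sketch of how these tunables map onto configuration, the
> following sets the properties added to yarn-default.xml; the key names and
> default values shown here should be verified against the committed revision:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
>
> public class PreemptionConfigSketch {
>   // Key names follow the yarn-default.xml entries added by the patch; verify
>   // them against revision 1502083 before relying on them.
>   public static Configuration create() {
>     Configuration conf = new Configuration();
>     // enable the scheduling monitor and select the edit policy
>     conf.setBoolean("yarn.resourcemanager.scheduler.monitor.enable", true);
>     conf.set("yarn.resourcemanager.scheduler.monitor.policies",
>         "org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity."
>             + "ProportionalCapacityPreemptionPolicy");
>     // observe-only mode: log intended actions, change nothing
>     conf.setBoolean(
>         "yarn.resourcemanager.monitor.capacity.preemption.observe_only", false);
>     // how frequently to run the policy (ms)
>     conf.setLong(
>         "yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval",
>         3000L);
>     // how long to wait between preemption and kill (ms)
>     conf.setLong(
>         "yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill",
>         15000L);
>     // fraction of the computed target to actually preempt per round
>     conf.setFloat(
>         "yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor",
>         0.2f);
>     // dead zone: ignore over-capacity below this fraction
>     conf.setFloat(
>         "yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity",
>         0.1f);
>     // cap on preemption per run, as a fraction of total cluster capacity
>     conf.setFloat(
>         "yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round",
>         0.1f);
>     return conf;
>   }
> }
> {code}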
> In our current experiments this set of tunables seems to be a good starting
> point for shaping the preemption action properly. More sophisticated
> preemption policies could take into account the different types of
> applications running, job priorities, the cost of preemption, or the
> integral of the capacity imbalance over time. This is very much a
> control-theory kind of problem, and some of the lessons on designing and
> tuning controllers are likely to apply.
> Generality:
> The monitor-based schedule editing and the preemption mechanisms we
> introduce here are designed to be more general than enforcing
> capacity/fairness; in fact, we are considering other monitors that leverage
> the same idea of "schedule edits" to target different global properties
> (e.g., allocating enough resources to guarantee deadlines for important
> jobs, data-locality optimizations, IO-balancing among nodes, etc.).
> Note that the preemption policy we describe here is disabled by default in
> the patch.
> Depends on YARN-45 and YARN-567; related to YARN-568.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira