[ 
https://issues.apache.org/jira/browse/YARN-1712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059590#comment-14059590
 ] 

Carlo Curino commented on YARN-1712:
------------------------------------

Attaching an interface, implementation and tests for the PlanFollower. 

The key idea is for this component to keep one (or more) Plan(s) in sync with 
the underlying scheduler. The implementation we present targets schedulers like 
the Capacity and Fair schedulers, which have a notion of queues and "entitlement" 
(a term we use to refer to both capacity and fair-scheduler weights). As per 
discussions with [~acmurthy], an alternative implementation could map the plan's 
allocations to dynamically updated priorities within a single queue.

The reservations in our Plan are published to the scheduler by 
creating/destroying/resizing leaf queues (a special new kind of dynamically 
creatable/destroyable leaf queue that is not required to be in sync with the 
config files). The policy is invoked on a timer; it makes a stateless 
comparison of the scheduler's queue state and the plan's current allocations, 
and computes deltas to determine which queues should be added, removed, or 
resized. At the end of its execution the set of queues existing in the 
scheduler (under the ParentQueue corresponding to each Plan) corresponds to the 
set of allocations active at the current instant in time, and each queue's 
capacity corresponds to the size of its allocation.
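
To make the comparison step concrete, here is a minimal sketch of one 
synchronization pass. It is an illustration only, not the code in the patch: 
the class PlanDeltaSketch, its Delta holder, and the use of plain maps from 
queue name to capacity are all assumptions made for this example.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch of one stateless sync pass; names are not from the patch. */
final class PlanDeltaSketch {

  /** Queues to add, remove, and resize, keyed by reservation (queue) name. */
  static final class Delta {
    final Map<String, Double> toAdd = new TreeMap<>();
    final Set<String> toRemove = new TreeSet<>();
    final Map<String, Double> toResize = new TreeMap<>();
  }

  /**
   * @param schedulerQueues capacity of each leaf queue currently under the plan's parent queue
   * @param activeAllocations capacity the plan assigns to each reservation active right now
   */
  static Delta computeDelta(Map<String, Double> schedulerQueues,
                            Map<String, Double> activeAllocations) {
    Delta delta = new Delta();
    // A queue with no matching active reservation must be removed.
    for (String queue : schedulerQueues.keySet()) {
      if (!activeAllocations.containsKey(queue)) {
        delta.toRemove.add(queue);
      }
    }
    for (Map.Entry<String, Double> e : activeAllocations.entrySet()) {
      Double current = schedulerQueues.get(e.getKey());
      if (current == null) {
        // Active reservation without a queue: create one.
        delta.toAdd.put(e.getKey(), e.getValue());
      } else if (Math.abs(current - e.getValue()) > 1e-9) {
        // Queue exists but its capacity differs from the plan: resize it.
        delta.toResize.put(e.getKey(), e.getValue());
      }
    }
    return delta;
  }
}
```

Because the result depends only on the two snapshots passed in, running the 
pass again once the scheduler matches the plan yields an empty delta.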

We operate in a stateless fashion (i.e., computing deltas on the fly, and not 
as a play-forward of an action log) so that we are resilient to various timing / 
queueing effects. If we are too slow, or for whatever reason miss a 
time step, we simply resync at the next moment in time (or whenever we 
are invoked).
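
A small sketch of why a missed time step is harmless: the set of active 
reservations is a function of the current time alone. The ResyncSketch class, 
the Reservation holder, and the half-open [start, end) interval convention are 
assumptions made for this illustration and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: which reservations are active is derived from "now" only. */
final class ResyncSketch {

  /** A reservation active over the half-open interval [start, end). */
  static final class Reservation {
    final String name;
    final long start;
    final long end;

    Reservation(String name, long start, long end) {
      this.name = name;
      this.start = start;
      this.end = end;
    }
  }

  /** Returns the names of the reservations active at time {@code now}, in plan order. */
  static List<String> activeAt(List<Reservation> plan, long now) {
    List<String> active = new ArrayList<>();
    for (Reservation r : plan) {
      if (r.start <= now && now < r.end) {
        active.add(r.name);
      }
    }
    return active;
  }
}
```

Since nothing here refers to the previous invocation, a run at time 12 produces 
the same answer whether or not the runs at times 8 through 11 ever happened.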

Conceptually this piece also provides a translation between reservation 
allocations, which are all expressed in "absolute" terms, and queue 
capacity/weight configurations, which are all relative to current cluster 
capacity. This allows us to amortize fluctuations in cluster capacity (up to a 
point, beyond which the plan needs to be reorganized).
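
The translation can be sketched as follows, under simplifying assumptions: 
resources are reduced to a single memory figure in MB, and the plan is taken 
to need reorganizing once the active allocations sum to more than the cluster 
offers. EntitlementSketch and both method names are hypothetical and do not 
come from the patch.

```java
/** Hypothetical sketch of the absolute-to-relative translation; not the patch's code. */
final class EntitlementSketch {

  /** Fraction of the current cluster that an absolute allocation represents. */
  static double toRelativeCapacity(long allocationMb, long clusterMb) {
    if (clusterMb <= 0) {
      throw new IllegalArgumentException("cluster capacity must be positive");
    }
    return (double) allocationMb / clusterMb;
  }

  /** True when the active allocations no longer fit in the current cluster. */
  static boolean needsReplanning(long[] allocationsMb, long clusterMb) {
    long total = 0;
    for (long allocation : allocationsMb) {
      total += allocation;
    }
    return total > clusterMb;
  }
}
```

With these definitions, a 10240 MB reservation maps to a larger fraction when 
the cluster shrinks from 102400 MB to 81920 MB, while the absolute amount 
promised to the reservation stays the same.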

As with the other sub-patches of YARN-1051, I am marking this as patch-available 
to signal that it is ready to be reviewed; however, the patch does NOT 
compile on its own.



> Admission Control: plan follower
> --------------------------------
>
>                 Key: YARN-1712
>                 URL: https://issues.apache.org/jira/browse/YARN-1712
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler, resourcemanager
>            Reporter: Carlo Curino
>            Assignee: Carlo Curino
>         Attachments: YARN-1712.patch
>
>
> This JIRA tracks a thread that continuously propagates the current state of 
> an inventory subsystem to the scheduler. As the inventory subsystem stores the 
> "plan" of how the resources should be subdivided, the work we propose in this 
> JIRA realizes that plan by dynamically instructing the CapacityScheduler to 
> add/remove/resize queues to follow the plan.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
