[ 
https://issues.apache.org/jira/browse/YARN-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15364893#comment-15364893
 ] 

Hitesh Sharma commented on YARN-5216:
-------------------------------------

[~asuresh], [~kkaranasos], thank you for the feedback and comments.

Regarding the refactoring being done and the reason to pull queues into the 
currently named {{OpportunisticContainerManager}}:

Roughly speaking the {{QueuingContainersManagerImpl}} does the following for 
starting and stopping opportunistic containers:


* A running container is simply preempted, while a container waiting in the 
queue is removed and the RM is notified so it can reallocate it elsewhere.
* It periodically checks whether too many containers are waiting in the queue 
and removes the excess so the RM can rebalance them. 
* When a running container finishes, a waiting opportunistic container is run 
if no guaranteed containers are waiting in the queue. 
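
The start order and the periodic trimming described above can be sketched 
roughly as follows. This is only an illustration: {{NodeQueuesSketch}}, its 
fields, and its method names are hypothetical and are not taken from 
{{QueuingContainersManagerImpl}} or the attached patches, and containers are 
reduced to plain string ids.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; not the NM implementation.
class NodeQueuesSketch {
    final Deque<String> queuedGuaranteed = new ArrayDeque<>();
    final Deque<String> queuedOpportunistic = new ArrayDeque<>();

    /** Called when a running container finishes; returns the next id to start, or null. */
    String onContainerFinished() {
        // Guaranteed containers always go ahead of opportunistic ones.
        if (!queuedGuaranteed.isEmpty()) {
            return queuedGuaranteed.pollFirst();
        }
        return queuedOpportunistic.pollFirst(); // null when nothing is waiting
    }

    /** Periodic check: removes queued opportunistic containers beyond maxQueued. */
    List<String> shedExcess(int maxQueued) {
        List<String> shed = new ArrayList<>();
        // Assumption: the most recently queued containers are shed first.
        while (queuedOpportunistic.size() > maxQueued) {
            shed.add(queuedOpportunistic.pollLast());
        }
        return shed; // the caller would report these to the RM for reallocation
    }
}
```

Shedding from the tail is an arbitrary choice made for the sketch; the real 
selection could just as well depend on the container's age or resource size.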

If the preemption policy is to kill the container, things are a little simpler 
and the opportunistic container queue can stay within 
{{QueuingContainersManagerImpl}}. With any other policy, however, we need 
extension points that tell the policy which operation 
{{QueuingContainersManagerImpl}} wants to perform so that it can respond 
appropriately. Say the preemption policy is to put the container in a paused 
state so that it can be resumed once there is room to run a container. The 
policy then has to distinguish whether {{QueuingContainersManagerImpl}} is 
looking to run pending containers (e.g. we want to resume a preempted container 
ahead of an OC that is still waiting in the queue) or to rebalance waiting 
containers to other nodes (e.g. we can't reallocate a container in the paused 
state). For these reasons the pluggable policy is named 
{{OpportunisticContainerManager}}: it lets you preempt and start opportunistic 
containers and also manages their queue. I'm open to suggestions on how to do 
this differently without having to change {{QueuingContainersManagerImpl}} a 
lot.
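
To make the distinction between "run pending" and "rebalance" concrete, here is 
a minimal sketch of what a pause-based policy might look like. The interface 
and class names ({{OpportunisticPolicySketch}}, {{PausePolicySketch}}) and all 
method names are hypothetical and do not reflect the API in the attached 
patches; containers are represented by string ids only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical extension points; not the API from the patch.
interface OpportunisticPolicySketch {
    /** A guaranteed container needs room: preempt this running opportunistic container. */
    void preempt(String containerId);

    /** A new opportunistic container arrived and cannot start yet. */
    void enqueue(String containerId);

    /** Room is available: returns the next container to run, or null if none. */
    String nextToRun();

    /** Too many are waiting: returns up to max containers the RM may reallocate. */
    List<String> pickForRebalance(int max);
}

class PausePolicySketch implements OpportunisticPolicySketch {
    private final Deque<String> paused = new ArrayDeque<>();
    private final Deque<String> waiting = new ArrayDeque<>();

    public void preempt(String containerId) {
        paused.addLast(containerId); // pause instead of kill
    }

    public void enqueue(String containerId) {
        waiting.addLast(containerId);
    }

    public String nextToRun() {
        // Resume a paused container before starting one that never ran.
        if (!paused.isEmpty()) {
            return paused.pollFirst();
        }
        return waiting.pollFirst();
    }

    public List<String> pickForRebalance(int max) {
        // Only containers that never started can move; paused ones stay on this node.
        List<String> picked = new ArrayList<>();
        while (picked.size() < max && !waiting.isEmpty()) {
            picked.add(waiting.pollLast());
        }
        return picked;
    }
}
```

A kill-based policy would implement {{preempt}} without keeping any state, 
which is why the queue could stay where it is in that case.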

[~asuresh], can you elaborate a little on why {{queuedGuaranteedContainers}} 
should also be pulled into the {{OpportunisticContainerManagerImpl}}?

I will look into using the ServiceLoader framework instead of reflection and 
add an extra constant for the default value.
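
For reference, a minimal sketch of how a {{java.util.ServiceLoader}} lookup 
with a default constant might look. Everything except {{ServiceLoader}} itself 
is an assumption made for illustration: {{PreemptionPolicySketch}}, 
{{KillPolicySketch}}, {{PolicyLoaderSketch}}, the {{name()}} method, and the 
{{DEFAULT_POLICY}} constant are hypothetical names, not what the patch will 
contain.

```java
import java.util.ServiceLoader;

// Hypothetical service interface; providers would be registered under META-INF/services.
interface PreemptionPolicySketch {
    String name();
}

// Built-in default so the NM still works when no provider is registered.
class KillPolicySketch implements PreemptionPolicySketch {
    public String name() {
        return "kill";
    }
}

class PolicyLoaderSketch {
    // The "extra constant": used when the configuration does not name a policy.
    static final String DEFAULT_POLICY = "kill";

    static PreemptionPolicySketch load(String configured) {
        String wanted = (configured == null) ? DEFAULT_POLICY : configured;
        // ServiceLoader instantiates registered providers lazily during iteration.
        for (PreemptionPolicySketch p : ServiceLoader.load(PreemptionPolicySketch.class)) {
            if (wanted.equals(p.name())) {
                return p;
            }
        }
        if (DEFAULT_POLICY.equals(wanted)) {
            return new KillPolicySketch(); // fall back to the built-in default
        }
        throw new IllegalArgumentException("No preemption policy named " + wanted);
    }
}
```

Compared with {{Class.forName}} plus {{newInstance}}, the configuration here 
carries a short policy name rather than a class name, and the provider classes 
are listed in a {{META-INF/services}} file on the classpath.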

Thanks a lot for the feedback and comments.

> Expose configurable preemption policy for OPPORTUNISTIC containers running on 
> the NM
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-5216
>                 URL: https://issues.apache.org/jira/browse/YARN-5216
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Arun Suresh
>            Assignee: Hitesh Sharma
>         Attachments: YARN5216.001.patch, yarn5216.002.patch
>
>
> Currently, the default action taken by the QueuingContainerManager, 
> introduced in YARN-2883, when a GUARANTEED Container is scheduled on an NM 
> with OPPORTUNISTIC containers using up resources, is to KILL the running 
> OPPORTUNISTIC containers.
> This JIRA proposes to expose a configurable hook to allow the NM to take a 
> different action.



