[ 
https://issues.apache.org/jira/browse/YARN-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16085035#comment-16085035
 ] 

Arun Suresh edited comment on YARN-6808 at 7/13/17 1:52 AM:
------------------------------------------------------------

[~leftnoteasy], Thanks for taking a look.

This patch provides an alternative to RM-side preemption by 
allocating containers over queue capacity as opportunistic containers. This can 
provide some advantages:
Since the over-capacity containers are opportunistic, they do not count towards 
the queue's used capacity.
* This means that starved apps on queues whose resources were encroached upon 
will be granted guaranteed containers immediately. They do not have to wait for 
a container to get preempted before a container is allocated.
* Opportunistic containers that were given to apps that asked for resources 
above their queue's capacity can continue running until a Guaranteed container 
that is allocated on the node is *started* by the AM. Even then, they are 
killed only if there is no room on the node.
* We believe this, along with the container pause feature (YARN-5972), will lead 
to better goodput and utilization. We are currently running experiments and 
will keep you posted.
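
To make the second point concrete, here is a rough sketch of the node-side behavior in plain Java. It is not code from the patch: {{NodeSketch}} and its methods are hypothetical, resources are reduced to memory in MB, and killing the most recently started opportunistic container first is purely an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a node; these are not real YARN classes.
final class NodeSketch {
    private final long capacityMb;
    private long guaranteedUsedMb = 0;
    private long opportunisticUsedMb = 0;
    // Sizes (MB) of running opportunistic containers, newest at the head.
    private final Deque<Long> opportunistic = new ArrayDeque<>();

    NodeSketch(long capacityMb) {
        this.capacityMb = capacityMb;
    }

    // Records an opportunistic container as running on this node.
    void startOpportunistic(long mb) {
        opportunistic.push(mb);
        opportunisticUsedMb += mb;
    }

    // Called when the AM actually *starts* a guaranteed container.
    // Opportunistic containers are killed only while the node lacks room.
    // Returns the sizes of the containers that were killed.
    List<Long> startGuaranteed(long mb) {
        List<Long> killed = new ArrayList<>();
        while (guaranteedUsedMb + opportunisticUsedMb + mb > capacityMb
                && !opportunistic.isEmpty()) {
            long victim = opportunistic.pop(); // assumption: newest first
            opportunisticUsedMb -= victim;
            killed.add(victim);
        }
        guaranteedUsedMb += mb;
        return killed;
    }
}
```

In this model, a guaranteed container that fits next to the running opportunistic containers kills nothing; containers are killed only once the node would otherwise be oversubscribed.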

As far as interaction with other scheduler components is concerned, it is 
fairly minimal. Currently, before the ResourceRequests are updated in the 
AppSchedulingInfo, the patch checks whether the asked ResourceRequest would 
exceed the app's headroom. If so, it just returns an Opportunistic 
container. (There are some other nuances that we currently ignore, such as 
locality.)
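
The headroom check described above could look roughly like the following. This is a simplified sketch, not the code in the patch: {{HeadroomSketch}}, {{Ask}} and {{decide}} are hypothetical names, resources are reduced to memory in MB, the comparison against the whole request is an assumption, and locality is ignored as noted.

```java
// Hypothetical sketch; none of these types are the real YARN classes.
final class HeadroomSketch {
    enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

    // Simplified stand-in for a ResourceRequest: memory only.
    static final class Ask {
        final long memoryMb;
        final int numContainers;

        Ask(long memoryMb, int numContainers) {
            this.memoryMb = memoryMb;
            this.numContainers = numContainers;
        }

        long totalMb() {
            return memoryMb * numContainers;
        }
    }

    // Decides the execution type before the ask is recorded by the scheduler:
    // a request that would exceed the app's headroom is served opportunistically.
    static ExecutionType decide(Ask ask, long headroomMb) {
        return ask.totalMb() > headroomMb
                ? ExecutionType.OPPORTUNISTIC
                : ExecutionType.GUARANTEED;
    }

    public static void main(String[] args) {
        System.out.println(decide(new Ask(1024, 2), 4096)); // within headroom
        System.out.println(decide(new Ask(1024, 8), 4096)); // over headroom
    }
}
```

An AM that only ever asks for ordinary containers would still receive OPPORTUNISTIC ones in the second case, without changing its request.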

One other motivation for us was that currently an AM has to explicitly ask the 
RM for an OPPORTUNISTIC container, which most AMs are not currently capable of 
doing. This patch offers one way for an RM to give OPPORTUNISTIC containers 
to an AM based on cluster load, etc.

BTW, I think the test case should give some hint of how it is meant to be 
used. Also, I guess I need to introduce a conf flag to turn on the feature.


> Allow Schedulers to return OPPORTUNISTIC containers when queues go over 
> configured capacity
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-6808
>                 URL: https://issues.apache.org/jira/browse/YARN-6808
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-6808.001.patch
>
>
> This is based on discussions with [~kasha] and [~kkaranasos].
> Currently, when a queue goes over capacity, apps on starved queues must wait 
> either for containers to complete or for them to be preempted by the 
> scheduler to get resources.
> This JIRA proposes to allow Schedulers to:
> # Allocate all containers over the configured queue capacity/weight as 
> OPPORTUNISTIC.
> # Auto-promote running OPPORTUNISTIC containers of apps as and when their 
> GUARANTEED containers complete.


