[jira] [Comment Edited] (YARN-6808) Allow Schedulers to return OPPORTUNISTIC containers when queues go over configured capacity

Wangda Tan (JIRA) Fri, 14 Jul 2017 11:50:25 -0700

    [ https://issues.apache.org/jira/browse/YARN-6808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16087782#comment-16087782 ]


Wangda Tan edited comment on YARN-6808 at 7/14/17 6:49 PM:
-----------------------------------------------------------

Thanks [~asuresh] for the additional explanations.

Frankly speaking, I'm a little bit worried about this direction.

Currently, opportunistic container allocation is too simple to match what 
existing schedulers can do, for example delay scheduling and user limits. You 
might say that there's no guarantee for opportunistic containers, so we don't 
have to honor all of these. This is true in the context of YARN-2877, which 
targets short-lived containers that don't need any guarantees. However, if we 
move all containers beyond a queue's configured capacity to opportunistic, it 
is a big deal. I'm not sure if you have any plans regarding this.

One related topic is YARN-1011: opportunistic containers will be used when a 
node is overcommitted. YARN-1013/YARN-1015 were opened to track the changes on 
the scheduler side. To me it's better to reuse existing scheduler code instead 
of duplicating code to achieve this. I'm not sure whether there are any 
concrete plans for this. cc: [~haibo.chen], [~kasha].

If, as you said, this approach is meant to replace existing preemption, I think 
this patch is far from that goal. Even with the latest patch, it can only mark 
which containers were allocated below the queue limit and which were allocated 
above it (and that mark could be wrong if the queue's configured capacity 
changes). In that case, I don't know how it can support existing preemption 
features such as preemption for large containers, intra-queue preemption for 
app priority / user limit, etc.
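
To illustrate the concern about marks going stale, here is a minimal Python 
sketch. It is a toy model, not YARN code: the `Container` class, the 
`allocate_all` and `stale_marks` helpers, and the memory figures are all 
hypothetical, and "queue limit" is reduced to a single memory number. Each 
container is marked once, at allocation time, and the marks are then compared 
against what the same rule would say after the configured capacity changes.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionType(Enum):
    GUARANTEED = "GUARANTEED"
    OPPORTUNISTIC = "OPPORTUNISTIC"


@dataclass
class Container:
    container_id: int
    memory_mb: int
    execution_type: ExecutionType  # fixed at allocation time in this toy model


def mark_for(used_mb, memory_mb, capacity_mb):
    """GUARANTEED if the container still fits under capacity, else OPPORTUNISTIC."""
    if used_mb + memory_mb <= capacity_mb:
        return ExecutionType.GUARANTEED
    return ExecutionType.OPPORTUNISTIC


def allocate_all(requests_mb, capacity_mb):
    """Allocate requests in order, marking each one against the current capacity."""
    containers, used_mb = [], 0
    for container_id, memory_mb in enumerate(requests_mb):
        mark = mark_for(used_mb, memory_mb, capacity_mb)
        containers.append(Container(container_id, memory_mb, mark))
        used_mb += memory_mb  # queue usage counts every running container
    return containers


def stale_marks(containers, new_capacity_mb):
    """Return ids whose stored mark disagrees with the rule under the new capacity."""
    stale, used_mb = [], 0
    for c in containers:
        if c.execution_type is not mark_for(used_mb, c.memory_mb, new_capacity_mb):
            stale.append(c.container_id)
        used_mb += c.memory_mb
    return stale


containers = allocate_all([1024, 1024, 1024, 1024], capacity_mb=2048)
print([c.execution_type.value for c in containers])
print(stale_marks(containers, new_capacity_mb=3072))  # capacity raised
print(stale_marks(containers, new_capacity_mb=1024))  # capacity lowered
```

In this model a capacity change in either direction leaves at least one 
container carrying a mark that no longer matches the queue's configuration, 
which is the situation the parenthetical above warns about.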

bq. Essentially, the extra delay will look just like localization delay to the 
AM. We have verified this is fine for MapReduce and Spark. 
To me this is too simple to solve the problem. Yes, existing MR/Spark can 
tolerate delays while containers are localizing, but in most cases 
localization happens only once on each node, because the files that need to be 
localized are mostly common between containers of the same app (or the same 
set of tasks, like mappers). It looks like in a large cluster, a delay would 
have to be expected for every OC launch. This is not acceptable for 
SLA-sensitive jobs.

I don't want to stop or slow down this innovation, but from what I can see, 
the existing assumptions and plans need a lot of time to be verified. In this 
case, as with YARN-1011, do you think it would be better to create an umbrella 
JIRA (use opportunistic containers to do preemption) and move the related work 
to a branch? 

Edit #1:

Probably the new umbrella should be: Use OC for normal containers. (YARN-2877 
does not target using OC for normal containers.)


> Allow Schedulers to return OPPORTUNISTIC containers when queues go over 
> configured capacity
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-6808
>                 URL: https://issues.apache.org/jira/browse/YARN-6808
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-6808.001.patch, YARN-6808.002.patch
>
>
> This is based on discussions with [~kasha] and [~kkaranasos].
> Currently, when a queue goes over capacity, apps on starved queues must wait 
> either for containers to complete or for them to be pre-empted by the 
> scheduler to get resources.
> This JIRA proposes to allow Schedulers to:
> # Allocate all containers over the configured queue capacity/weight as 
> OPPORTUNISTIC.
> # Auto-promote running OPPORTUNISTIC containers of apps as and when their 
> GUARANTEED containers complete.
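
The two steps proposed in the description above can be sketched as a toy 
Python model. This is only an illustration under stated assumptions, not the 
patch: `ToyQueue` is a hypothetical class, capacity is counted in container 
slots rather than resources, the queue holds a single app, and promoting the 
oldest OPPORTUNISTIC container first is an arbitrary choice made for the 
example.

```python
from dataclasses import dataclass, field
from enum import Enum


class ExecutionType(Enum):
    GUARANTEED = "GUARANTEED"
    OPPORTUNISTIC = "OPPORTUNISTIC"


@dataclass
class ToyQueue:
    capacity: int  # configured capacity, in container slots (assumption)
    containers: dict = field(default_factory=dict)  # id -> ExecutionType
    next_id: int = 0

    def guaranteed_count(self):
        return sum(
            1 for t in self.containers.values() if t is ExecutionType.GUARANTEED
        )

    def allocate(self):
        """Step 1: anything over the configured capacity is OPPORTUNISTIC."""
        if self.guaranteed_count() < self.capacity:
            execution_type = ExecutionType.GUARANTEED
        else:
            execution_type = ExecutionType.OPPORTUNISTIC
        container_id = self.next_id
        self.next_id += 1
        self.containers[container_id] = execution_type
        return container_id

    def complete(self, container_id):
        """Step 2: when a GUARANTEED container completes, promote one OPPORTUNISTIC."""
        finished = self.containers.pop(container_id)
        if finished is ExecutionType.GUARANTEED:
            for other_id, t in self.containers.items():
                if t is ExecutionType.OPPORTUNISTIC:
                    # Oldest first, because dicts keep insertion order.
                    self.containers[other_id] = ExecutionType.GUARANTEED
                    break
        return finished


queue = ToyQueue(capacity=2)
ids = [queue.allocate() for _ in range(4)]
print([queue.containers[i].value for i in ids])
queue.complete(ids[0])
print({i: t.value for i, t in queue.containers.items()})
```

The model leaves out everything the comment above is concerned with (user 
limits, delay scheduling, capacity changes, multiple apps per queue), which is 
why it should be read as a restatement of the proposal and not as a design.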



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
