Sunil G commented on YARN-2022:

Thank you Carlo for the clarifications on am-priority and user-limit-factor.

I agree with your point that a container's priority could be tampered with and 
set to 0. Given that, I feel your option 1 (track which container is the AM by 
some means other than Priority) is the better choice.
Even with option 2, the AM container first has to be identified among the 
multiple containers at Priority=0. Saving the AM first and then saving as many 
of the other containers as possible may not work well when many applications 
are marked for preemption.

When an AM container is launched, the RM has to have a way to mark it as an AM. 
CapacityScheduler has the RMContext, and from that we may be able to get the 
MasterContainer using the ApplicationAttemptID, but I feel this look-up is a 
little complex. It would be better to set a property directly on the container 
to mark it as the AM.
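
A minimal sketch of the marking idea, under stated assumptions: SketchContainer, 
its amContainer flag, and AmAwareSelection are hypothetical names used only for 
illustration and are not existing YARN classes. The flag stands for "some 
property set directly on the container", and the selection simply defers AM 
containers to a second pass so that they are preempted last.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a scheduler-side container record; not a YARN class. */
class SketchContainer {
    final String id;
    final int memoryMb;
    /** Hypothetical flag, assumed to be set once by the RM when it launches the AM. */
    final boolean amContainer;

    SketchContainer(String id, int memoryMb, boolean amContainer) {
        this.id = id;
        this.memoryMb = memoryMb;
        this.amContainer = amContainer;
    }
}

/** Hypothetical helper: selects preemption victims, touching AM containers last. */
class AmAwareSelection {
    static List<SketchContainer> select(List<SketchContainer> candidates, int targetMb) {
        List<SketchContainer> victims = new ArrayList<>();
        int remainingMb = targetMb;

        // Pass 1: take only non-AM containers, in the order given.
        for (SketchContainer c : candidates) {
            if (remainingMb <= 0) {
                break;
            }
            if (!c.amContainer) {
                victims.add(c);
                remainingMb -= c.memoryMb;
            }
        }

        // Pass 2: fall back to AM containers only if the target is still not met.
        for (SketchContainer c : candidates) {
            if (remainingMb <= 0) {
                break;
            }
            if (c.amContainer) {
                victims.add(c);
                remainingMb -= c.memoryMb;
            }
        }
        return victims;
    }
}
```

With the J3 and J2 containers from the issue description as candidates and a 
target of 4GB, this sketch picks two maps from J3 and two maps from J2 and 
leaves both AMs running. No look-up through RMContext is needed because the 
flag travels with the container.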

Also, with user-limit-factor and max-user-percentage, the scheduler keeps 
skipping containers, and having such an AM keep asking for containers again is 
not good. If this AM is a "savedAM" from preemption, it is even worse. For this 
case too we can add a checkpoint decision on whether to save the AM or not.
So, to summarize roughly:
  1) A better way of marking and finding the AM container is needed. [We can 
     also see whether this can be extended to save multiple low-priority 
     containers.]
  2) A checkpoint has to be derived from the factors below to decide whether 
     to save an AM or not:
       a. The max-am-percentage limit has to be honored.
       b. user-limit-factor or max-user-percentage also has to be checked.
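
As a rough illustration of how a. and b. above might combine into one 
checkpoint, here is a sketch under stated assumptions. AmSaveCheckpoint and all 
of its parameters are hypothetical, and the two conditions are only one possible 
reading of the factors, not a worked-out design: the AM is saved only if AM 
usage stays within the max-am-percentage share of the queue, and only if the 
user still has headroom under the user limit (otherwise the saved AM would keep 
asking for containers it cannot get).

```java
/** Hypothetical checkpoint for deciding whether to spare an AM from preemption. */
class AmSaveCheckpoint {
    /**
     * @param amUsedMb        memory held by AM containers in the queue, including this AM
     * @param queueCapacityMb configured capacity of the queue
     * @param maxAmPercent    max-am-percentage as a fraction, e.g. 0.1 for 10%
     * @param userUsedMb      memory currently used by the AM's user in the queue
     * @param userLimitMb     the user's limit, assumed to be already derived from
     *                        user-limit-factor / max-user-percentage
     * @return true if the AM should be saved, false if it may be preempted
     */
    static boolean shouldSaveAm(long amUsedMb, long queueCapacityMb, double maxAmPercent,
                                long userUsedMb, long userLimitMb) {
        // Factor a: AM usage must stay within the allowed AM share of the queue.
        boolean amLimitHonored = amUsedMb <= maxAmPercent * queueCapacityMb;
        // Factor b: the user must still be able to receive containers after the AM is saved.
        boolean userHasHeadroom = userUsedMb < userLimitMb;
        return amLimitHonored && userHasHeadroom;
    }
}
```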

I can first try to post a design approach for deriving the checkpoint decision 
from both a. and b. above. Please share any further thoughts on this.

> Preempting an Application Master container can be kept as least priority when 
> multiple applications are marked for preemption by 
> ProportionalCapacityPreemptionPolicy
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------
>                 Key: YARN-2022
>                 URL: https://issues.apache.org/jira/browse/YARN-2022
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: Yarn-2022.1.patch
> Cluster Size = 16GB [2NM's]
> Queue A Capacity = 50%
> Queue B Capacity = 50%
> Consider there are 3 applications running in Queue A which has taken the full 
> cluster capacity. 
> J1 = 2GB AM + 1GB * 4 Maps
> J2 = 2GB AM + 1GB * 4 Maps
> J3 = 2GB AM + 1GB * 2 Maps
> Another Job J4 is submitted in Queue B [J4 needs a 2GB AM + 1GB * 2 Maps ].
> Currently in this scenario, job J3 will get killed, including its AM.
> It is better if AM can be given least priority among multiple applications. 
> In this same scenario, map tasks from J3 and J2 can be preempted.
> Later when cluster is free, maps can be allocated to these Jobs.
