[jira] [Updated] (YARN-4390) Do surgical preemption based on reserved container in CapacityScheduler

Wangda Tan (JIRA) Tue, 26 Apr 2016 16:20:06 -0700

     [ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Wangda Tan updated YARN-4390:
-----------------------------
    Attachment: YARN-4390.7.patch

For comments from [~jianhe],
bq. why is this exception changed to be ignored ? (combine this try-catch 
clause with the one underneath)
Because we don't want the RM to fail because of a failure in the preemption 
policy.
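
To illustrate the intent, here is a minimal sketch, not the code from the 
patch. The EditPolicy interface and the PolicyInvoker class are hypothetical 
names used only for this example; the idea is that a failing policy run is 
logged and skipped instead of propagating into the caller.

```java
/** Hypothetical stand-in for a preemption policy; not the real YARN interface. */
interface EditPolicy {
    void editSchedule();
}

/** Hypothetical wrapper that isolates the caller from policy failures. */
class PolicyInvoker {
    private int failures = 0;

    /** Runs one policy pass; returns true if it completed, false if it threw. */
    boolean invokeOnce(EditPolicy policy) {
        try {
            policy.editSchedule();
            return true;
        } catch (RuntimeException e) {
            // Log and skip this round; the next scheduled run can try again.
            failures++;
            System.err.println("Preemption policy failed, skipping this round: " + e);
            return false;
        }
    }

    int failureCount() {
        return failures;
    }
}
```

With this shape, a caller that runs the policy on a timer keeps running even 
if a single pass throws.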

bq. code does not match comment? comment says not 
considersReservedResourceWhenCalculateIdeal
Updated. It was actually added in the wrong place: instead of allowing 
preemption beyond total_preemptable, we should not apply 
natural_termination_factor for the reserved-preemption-candidates-selector.
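
As a rough illustration only (this is not the patch code, and 
PreemptionTargetSketch, targetMb and applyFactor are hypothetical names): the 
natural termination factor scales down how much of a computed preemption delta 
is actually preempted in one round. One plausible reading of the change is that 
a selector working from a reserved container should skip that scaling, since 
freeing only a fraction of the delta would not make room for the reservation.

```java
/** Hypothetical sketch of a per-round preemption target; not YARN code. */
class PreemptionTargetSketch {
    /**
     * @param deltaMb                  resources above the ideal assignment, in MB
     * @param naturalTerminationFactor fraction of the delta to preempt per round
     * @param applyFactor              false for a selector that needs the full amount
     * @return the amount to preempt this round, in MB
     */
    static long targetMb(long deltaMb, double naturalTerminationFactor,
                         boolean applyFactor) {
        if (deltaMb <= 0) {
            return 0;
        }
        // Truncates toward zero when the factor is applied.
        return applyFactor ? (long) (deltaMb * naturalTerminationFactor) : deltaMb;
    }
}
```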

In addition (cc: [~kasha])
For the SchedulerNode changes, I ran the benchmark tests without the volatile 
changes, so all fields of SchedulerNode remain synchronized. I still get 
similar results: for a 1000-node cluster, each run of the preemption policy 
takes ~10 ms.

So I removed all volatile/ConcurrentMap changes to 
SchedulerNode/FiCaSchedulerNode and kept only a few cosmetic changes. Please 
let me know your thoughts.

> Do surgical preemption based on reserved container in CapacityScheduler
> -----------------------------------------------------------------------
>
>                 Key: YARN-4390
>                 URL: https://issues.apache.org/jira/browse/YARN-4390
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.0.0, 2.8.0, 2.7.3
>            Reporter: Eric Payne
>            Assignee: Wangda Tan
>         Attachments: QueueNotHittingMax.jpg, YARN-4390-design.1.pdf, 
> YARN-4390-test-results.pdf, YARN-4390.1.patch, YARN-4390.2.patch, 
> YARN-4390.3.branch-2.patch, YARN-4390.3.patch, YARN-4390.4.patch, 
> YARN-4390.5.patch, YARN-4390.6.patch, YARN-4390.7.patch
>
>
> There are multiple reasons why preemption could unnecessarily preempt 
> containers. One is that an app could be requesting a large container (say 
> 8-GB), and the preemption monitor could conceivably preempt multiple 
> containers (say 8, 1-GB containers) in order to fill the large container 
> request. These smaller containers would then be rejected by the requesting AM 
> and potentially given right back to the preempted app.
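
The scenario in the description can be sketched as follows. This is an 
illustrative model only, not the scheduler's data structures: the Node class 
and both helper methods are hypothetical. It shows that the memory reclaimable 
across the whole cluster can cover an 8-GB request even when no single node can 
host it, which is why picking victims on one node (for example, the node where 
the container is reserved) avoids wasted preemption.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of per-node preemption; not YARN code. */
class SurgicalPreemptionSketch {
    /** A node's free memory plus the sizes of its preemptable containers (MB). */
    static final class Node {
        final long freeMb;
        final long[] preemptableMb;

        Node(long freeMb, long... preemptableMb) {
            this.freeMb = freeMb;
            this.preemptableMb = preemptableMb;
        }

        /** Memory available on this node if every preemptable container is killed. */
        long reclaimableMb() {
            return freeMb + Arrays.stream(preemptableMb).sum();
        }
    }

    /** Cluster-wide view: ignores which node the memory lives on. */
    static long clusterWideReclaimableMb(List<Node> nodes) {
        return nodes.stream().mapToLong(Node::reclaimableMb).sum();
    }

    /** Per-node view: index of the first node that can fit the request, or -1. */
    static int pickNode(List<Node> nodes, long requestMb) {
        for (int i = 0; i < nodes.size(); i++) {
            if (nodes.get(i).reclaimableMb() >= requestMb) {
                return i;
            }
        }
        return -1;
    }
}
```

With eight nodes that each hold one preemptable 1-GB container, the 
cluster-wide total is 8192 MB, yet pickNode returns -1 for an 8192-MB request, 
so preempting those containers would not help the requesting app.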



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
