[ 
https://issues.apache.org/jira/browse/YARN-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yufei Gu updated YARN-6793:
---------------------------
    Description: 
There is a delay between when preemption is triggered and when the containers 
are actually killed. If the resources released on a node before the containers 
are killed are not enough to satisfy the resource request that preemption is 
trying to fulfill, a reservation is made again on that node.
E.g., the scheduler reserves <memory 2048, vcore 2> on node 1 for app 1 during 
preemption. By default it takes 15s to kill the containers on node 1 to 
fulfill that resource request. If <memory 1024, vcore 1> is released from 
node 1 before the kill, the scheduler reserves <memory 2048, vcore 2> again on 
node 1 for app 1. The second reservation may never be unreserved. 

  was:
There is a delay between when preemption is triggered and when the containers 
are actually killed. If the resources released on a node before the containers 
are killed are not enough to satisfy the resource request that preemption is 
trying to fulfill, a reservation is made again on that node.
E.g., the scheduler reserves <memory 2048, vcore 2> on node 1 for app 1. By 
default it takes 15s to kill the containers on node 1 to fulfill that resource 
request. If <memory 1024, vcore 1> is released from node 1 before the kill, 
the scheduler reserves <memory 2048, vcore 2> again on node 1 for app 1. 
The second reservation may never be unreserved. 


> Duplicated reservation in Fair Scheduler preemption 
> ----------------------------------------------------
>
>                 Key: YARN-6793
>                 URL: https://issues.apache.org/jira/browse/YARN-6793
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.8.1, 3.0.0-alpha3
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>            Priority: Critical
>
> There is a delay between when preemption is triggered and when the 
> containers are actually killed. If the resources released on a node before 
> the containers are killed are not enough to satisfy the resource request that 
> preemption is trying to fulfill, a reservation is made again on that node.
> E.g., the scheduler reserves <memory 2048, vcore 2> on node 1 for app 1 
> during preemption. By default it takes 15s to kill the containers on node 1 
> to fulfill that resource request. If <memory 1024, vcore 1> is released from 
> node 1 before the kill, the scheduler reserves <memory 2048, vcore 2> again 
> on node 1 for app 1. The second reservation may never be unreserved. 
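
The sequence described above can be sketched with a toy model. This is not
FairScheduler code: Resource, Node, try_assign, simulate and the dedupe flag
are hypothetical names made up for illustration, and the dedupe guard is only
one conceivable way to avoid the second reservation, not necessarily the fix
adopted for this issue. The sketch assumes that any allocation attempt whose
request does not fit on the node results in a new reservation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    memory: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        # A request fits only if every dimension fits.
        return self.memory <= other.memory and self.vcores <= other.vcores


@dataclass
class Node:
    available: Resource
    reservations: list = field(default_factory=list)


def try_assign(node: Node, app: str, request: Resource, dedupe: bool) -> str:
    """Toy allocation attempt: allocate if the request fits, else reserve."""
    if request.fits_in(node.available):
        return "allocated"
    # Hypothetical guard: skip reserving if the same reservation already exists.
    if dedupe and (app, request) in node.reservations:
        return "already reserved"
    node.reservations.append((app, request))
    return "reserved"


def simulate(dedupe: bool) -> int:
    request = Resource(memory=2048, vcores=2)
    node = Node(available=Resource(memory=0, vcores=0))

    # Preemption starts: the request is reserved on node 1 for app 1.
    try_assign(node, "app1", request, dedupe)

    # Before the containers are killed, <memory 1024, vcore 1> is released.
    # That is still too little, so the next attempt tries to reserve again.
    node.available = Resource(memory=1024, vcores=1)
    try_assign(node, "app1", request, dedupe)

    return len(node.reservations)


print(simulate(dedupe=False))  # 2: the reservation is duplicated
print(simulate(dedupe=True))   # 1: the guard suppresses the duplicate
```

Without the guard the node ends up holding two identical reservations for
app 1, which matches the situation reported here: only one of them is
fulfilled once the containers are killed, and the other one lingers.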


