[ 
https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14630167#comment-14630167
 ] 

Wangda Tan commented on YARN-3784:
----------------------------------

Beyond the timeout, another thing we may need to consider is: after a container 
is removed from the to-be-preempted list, should we notify the scheduler/AM 
about that? This could happen if other applications release containers, or if 
other queues/applications cancel their resource requests.

Currently proportionalCPP (ProportionalCapacityPreemptionPolicy) can notify the 
scheduler many times for the same container. If we had separate 
to-preempt/remove-from-to-preempt events, we could also reduce the number of 
messages sent to the scheduler (the volume of which could trigger the problem 
described in YARN-3508).
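
To make the idea concrete, here is a rough sketch of what delta-based 
notification could look like. It is only an illustration, not the existing 
policy code: PreemptionDeltaTracker, its Event type and the two event kinds are 
hypothetical names, and container ids are plain strings instead of ContainerId. 
The tracker remembers which containers are already marked and emits an event 
only when a container enters or leaves the to-be-preempted set.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of the YARN code base. */
class PreemptionDeltaTracker {

  /** Hypothetical event kinds for the to-preempt / remove-from-to-preempt idea. */
  enum Kind { MARK_FOR_PREEMPTION, CANCEL_PREEMPTION }

  static final class Event {
    final Kind kind;
    final String containerId;

    Event(Kind kind, String containerId) {
      this.kind = kind;
      this.containerId = containerId;
    }

    @Override
    public String toString() {
      return kind + ":" + containerId;
    }
  }

  // Containers the scheduler has already been told about.
  private final Set<String> marked = new TreeSet<>();

  /**
   * Called once per policy run with the containers currently selected for
   * preemption. Returns only the changes since the previous run.
   */
  List<Event> update(Set<String> selected) {
    List<Event> events = new ArrayList<>();

    // Newly selected containers: one to-preempt event each.
    for (String id : new TreeSet<>(selected)) {
      if (marked.add(id)) {
        events.add(new Event(Kind.MARK_FOR_PREEMPTION, id));
      }
    }

    // Containers no longer selected: one remove-from-to-preempt event each.
    Iterator<String> it = marked.iterator();
    while (it.hasNext()) {
      String id = it.next();
      if (!selected.contains(id)) {
        it.remove();
        events.add(new Event(Kind.CANCEL_PREEMPTION, id));
      }
    }
    return events;
  }
}
```

With something like this, a run that selects the same containers as the 
previous run produces no events at all, and a container that drops out of the 
selection produces a single cancel event that could be forwarded to the 
scheduler/AM.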

> Indicate preemption timeout along with the list of containers to AM 
> (preemption message)
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch, 0002-YARN-3784.patch
>
>
> Currently, during preemption, the AM is notified with a list of containers 
> that are marked for preemption. This proposes sending a timeout duration along 
> with this container list, so that the AM knows how much time it has to 
> gracefully shut down its containers (assuming one of the preemption policies 
> is loaded in the AM).
> This will also help in NM decommissioning scenarios, where an NM is 
> decommissioned after a timeout (which also kills the containers on it). The 
> timeout tells the AM that the RM can forcefully kill those containers once 
> the timeout expires.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
