[ https://issues.apache.org/jira/browse/YARN-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sunil G updated YARN-3784:
--------------------------
    Attachment: 0001-YARN-3784.patch

Uploading an initial version.

As per the existing preemption framework, the AM fetches the containers that are to be preempted during each allocate call. Along with this list, the AM can also fetch a proposed time duration after which the RM will forcefully kill those containers.

This patch does not set the preemption timeout per container; it sets it per application. So all containers marked for preemption within one heartbeat interval share a common timeout. If two types of containers are marked for preemption with different timeouts, the lowest one is reported.

If we provide the timeout per container, we need to change the interface from a list of containers to a map <containerId, timeOut>. Please share your thoughts on this point.

> Indicate preemption timeout along with the list of containers to AM (preemption message)
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch
>
>
> Currently, during preemption, the AM is notified with a list of containers that
> are marked for preemption. This issue introduces a timeout duration along with
> that container list, so that the AM knows how much time it has to gracefully
> shut down its containers (assuming a preemption policy is loaded in the AM).
> This will also help in NM decommissioning scenarios, where the NM is
> decommissioned after a timeout (which also kills the containers on it). The
> timeout tells the AM that the RM can forcefully kill those containers once
> it expires.
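
To make the per-application behaviour described in the comment concrete, here is a minimal sketch of collapsing per-container timeouts into the single lowest value that would be reported for one heartbeat. It is not taken from 0001-YARN-3784.patch: the class name, the method name, the use of string container IDs, and the millisecond unit are all assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch: none of these names come from YARN or from the patch.
class PreemptionTimeoutSketch {

    // Per-application view: given the timeout of each container marked for
    // preemption in one heartbeat (assumed to be in milliseconds), return the
    // single common value to report, which is the lowest one. The result is
    // empty when no container is marked for preemption.
    static OptionalLong commonTimeoutMs(Map<String, Long> timeoutByContainerMs) {
        return timeoutByContainerMs.values().stream()
                .mapToLong(Long::longValue)
                .min();
    }

    public static void main(String[] args) {
        // Two containers marked for preemption with different timeouts.
        Map<String, Long> marked = new LinkedHashMap<>();
        marked.put("container_A", 30_000L);
        marked.put("container_B", 15_000L);

        OptionalLong common = commonTimeoutMs(marked);
        if (common.isPresent()) {
            System.out.println("common timeout (ms): " + common.getAsLong());
        }
    }
}
```

The map argument here mirrors the <containerId, timeOut> shape mentioned as the per-container alternative; with the per-application approach only the reduced value would need to travel alongside the existing container list.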