Sunil G commented on YARN-3784:

Hi [~Naganarasimha Garla]
Thank you for sharing your thoughts; it is never too late. :)
Yes, preemption currently happens only from PCPP. After YARN-3224, preemption 
will also happen in decommissioning cases, so the timeout before a container 
is forcefully killed is a variable quantity. In the ideal case we assume it is 
greater than the AM heartbeat interval (and usually it will be), but we can 
still hit this corner case. In the existing system, the AM skips such 
preemption requests because the containers might already have been killed. 
This ticket adds one more piece of information, the timeout, along with the 
list of containers. So it is additional information for the AM, which the AM 
can skip if the containers have already been preempted.
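
To make the AM-side handling concrete, here is a rough sketch of what I have 
in mind. It is only an illustration: PreemptionNotice and the helper methods 
are hypothetical names, not existing YARN classes, and treating a non-positive 
timeout as "already expired" is an assumption of the sketch, not settled 
behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types are YARN classes.
final class PreemptionTimeoutSketch {

    /** Hypothetical preemption message: container ids plus a timeout in ms. */
    static final class PreemptionNotice {
        final List<String> containerIds;
        final long timeoutMillis;

        PreemptionNotice(List<String> containerIds, long timeoutMillis) {
            this.containerIds = containerIds;
            this.timeoutMillis = timeoutMillis;
        }
    }

    /**
     * Keeps only the containers the AM still runs. Containers that are no
     * longer running are skipped, since they may already have been killed.
     */
    static List<String> containersToShutDown(PreemptionNotice notice,
                                             Set<String> running) {
        List<String> result = new ArrayList<>();
        for (String id : notice.containerIds) {
            if (running.contains(id)) {
                result.add(id);
            }
        }
        return result;
    }

    /**
     * Time left for a graceful shutdown, given how long ago the notice was
     * issued. Assumption: a non-positive timeout means it has already passed.
     */
    static long gracePeriodMillis(PreemptionNotice notice, long elapsedMillis) {
        if (notice.timeoutMillis <= 0) {
            return 0L;
        }
        return Math.max(0L, notice.timeoutMillis - elapsedMillis);
    }
}
```

With this shape, an AM that receives a notice for containers it no longer 
runs simply gets an empty list back, and the timeout is ignored for them.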

I will see whether I can mark the timeout as 0 or -1 to indicate that the 
timeout has already passed. Hope this clarifies the question. Pls let me know if its 

> Indicate preemption timeout along with the list of containers to AM 
> (preemption message)
> ---------------------------------------------------------------------------------------
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch, 0002-YARN-3784.patch, 
> 0003-YARN-3784.patch, 0004-YARN-3784.patch
> Currently, during preemption, the AM is notified with a list of containers 
> that are marked for preemption. This sub-task introduces a timeout duration 
> along with this container list so that the AM knows how much time it has to 
> gracefully shut down its containers (assuming one of the preemption policies 
> is loaded in the AM).
> This will help in NM decommissioning scenarios, where the NM is 
> decommissioned after a timeout (which also kills the containers on it). The 
> timeout tells the AM that the RM can forcefully kill those containers once 
> the timeout expires.

