Sunil G commented on YARN-3784:

Thank you [~chris.douglas] for the comments.
I will update the patch to correct these problems. Regarding the point below,
bq.If containers are preempted for multiple causes (e.g., over-capacity, NM 
decommission), then the time to preempt could vary widely
My concern was the same. Currently the preemption message will look like below.
 message PreemptionContractProto {
   repeated PreemptionResourceRequestProto resource = 1;
   repeated PreemptionContainerProto container = 2;
+  optional int64 timeout = 3;
 }

 message PreemptionContainerProto {
   optional ContainerIdProto id = 1;
 }

I have added {{timeout}} at the message level. I can try attaching it at the 
container level as an optional parameter instead. One potential bottleneck is 
that different preemption events (ProportionalCPP, decommission, etc.) can 
reach the application at different times, and the {{allocate}} call from 
ApplicationMasterService may arrive a few seconds later to fetch the 
"to be preempted" containers. Hence some of the timeout may already have 
elapsed for a few containers. We could subtract the elapsed time before 
sending to the AM, but would storing the last update time per container 
overload the scheduler if many containers are marked for preemption?
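To make the idea concrete, the per-container bookkeeping could look roughly like the sketch below. This is only an illustration of the elapsed-time subtraction, not actual YARN code; the class name {{PreemptionTimeoutTracker}} and the use of a plain string container id are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: record when each container was marked for preemption,
// so the remaining timeout can be computed when the AM's allocate() arrives.
public class PreemptionTimeoutTracker {
  private final Map<String, Long> markTimeMs = new ConcurrentHashMap<>();

  // Called when the scheduler marks a container for preemption.
  public void mark(String containerId, long nowMs) {
    markTimeMs.putIfAbsent(containerId, nowMs);
  }

  // Remaining time = total timeout minus time already elapsed since marking,
  // floored at zero. Untracked containers get the full timeout.
  public long remainingTimeout(String containerId, long totalTimeoutMs, long nowMs) {
    Long marked = markTimeMs.get(containerId);
    if (marked == null) {
      return totalTimeoutMs;
    }
    return Math.max(0L, totalTimeoutMs - (nowMs - marked));
  }

  // Drop the entry once the container is released or actually preempted.
  public void clear(String containerId) {
    markTimeMs.remove(containerId);
  }
}
```

The memory cost is one map entry per marked container, which is the scheduler-overhead concern raised above.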

> Indicate preemption timeout along with the list of containers to AM 
> (preemption message)
> ---------------------------------------------------------------------------------------
>                 Key: YARN-3784
>                 URL: https://issues.apache.org/jira/browse/YARN-3784
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3784.patch
> Currently during preemption, the AM is notified with a list of containers 
> which are marked for preemption. This introduces a timeout duration along 
> with the container list so that the AM knows how much time it has to do a 
> graceful shutdown of its containers (assuming a preemption policy is loaded 
> in the AM).
> This will help in NM decommissioning scenarios, where the NM will be 
> decommissioned after a timeout (also killing the containers on it). The 
> timeout indicates to the AM that those containers can be killed by the RM 
> forcefully after it expires.
