[ 
https://issues.apache.org/jira/browse/YARN-10341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152241#comment-17152241
 ] 

Eric Yang commented on YARN-10341:
----------------------------------

[~BilwaST] Sorry, I am confused by this ticket and by how the proposed patch 
fixes the described problem.  
The container "restart_policy" controls whether a container is restarted after 
it fails or is killed.  If it is not set, the container is always restarted.  
If it is set to "NEVER", the container is not restarted.  The completion events 
are secondary information that helps decide whether to restart the containers.  
Using return or break in the onContainersCompleted method does not make any 
difference.

Maybe I am missing something. Could you give more information on how this 
patch addresses the observed issue?

> Yarn Service Container Completed event doesn't get processed 
> -------------------------------------------------------------
>
>                 Key: YARN-10341
>                 URL: https://issues.apache.org/jira/browse/YARN-10341
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Critical
>         Attachments: YARN-10341.001.patch
>
>
> If there are 10 workers running and containers get killed, after a while we 
> see that there are just 9 workers running. This is because the CONTAINER 
> COMPLETED event is not processed on the AM side. 
> The issue is in the code below:
> {code:java}
> public void onContainersCompleted(List<ContainerStatus> statuses) {
>   for (ContainerStatus status : statuses) {
>     ContainerId containerId = status.getContainerId();
>     ComponentInstance instance =
>         liveInstances.get(status.getContainerId());
>     if (instance == null) {
>       LOG.warn(
>           "Container {} Completed. No component instance exists. "
>               + "exitStatus={}. diagnostics={} ",
>           containerId, status.getExitStatus(), status.getDiagnostics());
>       // Leaves the method, so later statuses in this list are not handled.
>       return;
>     }
>     ComponentEvent event =
>         new ComponentEvent(instance.getCompName(), CONTAINER_COMPLETED)
>             .setStatus(status).setInstance(instance)
>             .setContainerId(containerId);
>     dispatcher.getEventHandler().handle(event);
>   }
> }
> {code}
> If no component instance exists for a container, the loop does not iterate 
> over the remaining containers, because the method returns at that point.
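
To make the reported behaviour easier to check in isolation, here is a minimal 
sketch of the loop with plain strings standing in for ContainerStatus and 
ComponentInstance. The class and method names (CompletedLoopSketch, 
dispatchWithReturn, dispatchWithContinue) are hypothetical and not part of 
YARN, and the continue variant is only one possible alternative; it is not 
taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the AM callback: container ids are plain strings,
// and "dispatching" an event just records the id in a list.
class CompletedLoopSketch {

  // Mirrors the quoted loop: the first unknown container ends the whole call.
  static List<String> dispatchWithReturn(List<String> completed, Set<String> live) {
    List<String> dispatched = new ArrayList<>();
    for (String id : completed) {
      if (!live.contains(id)) {
        return dispatched; // later ids in this batch are never examined
      }
      dispatched.add(id);
    }
    return dispatched;
  }

  // Assumed alternative: skip only the unknown container and keep iterating.
  static List<String> dispatchWithContinue(List<String> completed, Set<String> live) {
    List<String> dispatched = new ArrayList<>();
    for (String id : completed) {
      if (!live.contains(id)) {
        continue; // ignore this id, still handle the rest of the batch
      }
      dispatched.add(id);
    }
    return dispatched;
  }

  public static void main(String[] args) {
    List<String> completed = Arrays.asList("c1", "unknown", "c2");
    Set<String> live = new HashSet<>(Arrays.asList("c1", "c2"));
    System.out.println(dispatchWithReturn(completed, live));   // prints [c1]
    System.out.println(dispatchWithContinue(completed, live)); // prints [c1, c2]
  }
}
```

In this sketch a break at the same spot would behave like the return, since 
nothing follows the loop; only the continue variant reaches "c2".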


