[ 
https://issues.apache.org/jira/browse/YARN-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411077#comment-16411077
 ] 

Bilwa S T commented on YARN-8065:
---------------------------------

Blacklisting of nodes does not happen in the following scenarios:
 # RMAppAttempt is in ALLOCATED state and a LAUNCH_FAILED event arrives while the NM is down.
 # RMAppAttempt is in LAUNCHED state and an EXPIRE event arrives while the NM is down.

In both cases the app attempt moves to FINAL_SAVING and eventually to the 
FINAL state before RMContainerImpl triggers the CONTAINER_FINISHED event. 
Once the attempt is in the FINAL state, the CONTAINER_FINISHED event is 
ignored, so the AM node is never blacklisted.
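
The event ordering described above can be illustrated with a small, self-contained model. This is only a sketch and not the actual RMAppAttemptImpl code: the class AttemptBlacklistSketch, its replay helper, the STATE_SAVED event name and the node name "nm1" are all hypothetical, and the model assumes that the AM node is recorded for blacklisting only when CONTAINER_FINISHED is processed before the attempt reaches FINAL.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of the attempt state machine.
// It is NOT the real RMAppAttemptImpl; names and transitions are assumptions.
class AttemptBlacklistSketch {
    enum State { ALLOCATED, LAUNCHED, FINAL_SAVING, FINAL }
    enum Event { LAUNCH_FAILED, EXPIRE, STATE_SAVED, CONTAINER_FINISHED }

    private State state;
    private final String amNode;
    private final Set<String> blacklistedNodes = new HashSet<>();

    AttemptBlacklistSketch(State initial, String amNode) {
        this.state = initial;
        this.amNode = amNode;
    }

    void handle(Event event) {
        // Assumption: once FINAL is reached, every event is dropped.
        if (state == State.FINAL) {
            return;
        }
        switch (event) {
            case LAUNCH_FAILED:
                if (state == State.ALLOCATED) state = State.FINAL_SAVING;
                break;
            case EXPIRE:
                if (state == State.LAUNCHED) state = State.FINAL_SAVING;
                break;
            case STATE_SAVED:
                if (state == State.FINAL_SAVING) state = State.FINAL;
                break;
            case CONTAINER_FINISHED:
                // Assumption: this is the only place the AM node is recorded.
                blacklistedNodes.add(amNode);
                if (state != State.FINAL_SAVING) state = State.FINAL_SAVING;
                break;
        }
    }

    // Replays a sequence of events and returns the nodes that were blacklisted.
    static Set<String> replay(State initial, String amNode, Event... events) {
        AttemptBlacklistSketch attempt = new AttemptBlacklistSketch(initial, amNode);
        for (Event e : events) {
            attempt.handle(e);
        }
        return attempt.blacklistedNodes;
    }

    public static void main(String[] args) {
        // Scenario 1: LAUNCH_FAILED in ALLOCATED, CONTAINER_FINISHED arrives too late.
        System.out.println(replay(State.ALLOCATED, "nm1",
                Event.LAUNCH_FAILED, Event.STATE_SAVED, Event.CONTAINER_FINISHED));
        // Scenario 2: EXPIRE in LAUNCHED, CONTAINER_FINISHED arrives too late.
        System.out.println(replay(State.LAUNCHED, "nm1",
                Event.EXPIRE, Event.STATE_SAVED, Event.CONTAINER_FINISHED));
        // Contrast: CONTAINER_FINISHED arrives first, so the node is recorded.
        System.out.println(replay(State.LAUNCHED, "nm1",
                Event.CONTAINER_FINISHED, Event.STATE_SAVED));
    }
}
```

In this model the first two replays leave the blacklist empty because CONTAINER_FINISHED is only seen after FINAL, while the third records the node. Whether the real fix should handle the event in the final states or blacklist on LAUNCH_FAILED and EXPIRE directly is not decided by this sketch.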

> Application is failing when AM node is stopped
> ----------------------------------------------
>
>                 Key: YARN-8065
>                 URL: https://issues.apache.org/jira/browse/YARN-8065
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>             Fix For: 3.0.0
>
>
> Configure yarn.scheduler.capacity.schedule-asynchronously.enable as *true* 
> and yarn.resourcemanager.nodemanagers.heartbeat-interval-ms as *60000*. Run 
> an application and bring the *AM node* down. The application will fail.
> If the same node is picked to launch the AM attempt again, the application 
> can fail. This is more likely to occur with fewer nodes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
