[
https://issues.apache.org/jira/browse/YARN-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411077#comment-16411077
]
Bilwa S T commented on YARN-8065:
---------------------------------
Blacklisting of nodes does not happen in the following scenarios:
# RMAppAttempt is in ALLOCATED and a LAUNCH_FAILED event arrives while the NM is down.
# RMAppAttempt is in LAUNCHED and an EXPIRE event arrives while the NM is down.
In both cases the app attempt moves to FINAL_SAVING and eventually to the FINAL
state before RMContainerImpl triggers the CONTAINER_FINISHED event, and in
the FINAL state the CONTAINER_FINISHED event is ignored, so the node is never
added to the blacklist.
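To make the event ordering easier to follow, here is a minimal toy model of the two scenarios. It is only a sketch: the class, enum, and method names are hypothetical and do not mirror the real RMAppAttemptImpl transitions, and it assumes that blacklisting is driven solely by the CONTAINER_FINISHED event.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the ordering problem described above. All names here are
// hypothetical; this is not the real RMAppAttemptImpl state machine.
class AttemptBlacklistSketch {
    enum State { ALLOCATED, LAUNCHED, FINAL_SAVING, FINAL }
    enum Event { LAUNCH_FAILED, EXPIRE, ATTEMPT_UPDATE_SAVED, CONTAINER_FINISHED }

    private State state;
    private final String amNode;
    private final Set<String> blacklistedNodes = new HashSet<>();

    AttemptBlacklistSketch(State initial, String amNode) {
        this.state = initial;
        this.amNode = amNode;
    }

    void handle(Event event) {
        switch (state) {
            case ALLOCATED:
            case LAUNCHED:
                if (event == Event.CONTAINER_FINISHED) {
                    // Assumption: only this event blacklists the AM node.
                    blacklistedNodes.add(amNode);
                    state = State.FINAL_SAVING;
                } else if ((state == State.ALLOCATED && event == Event.LAUNCH_FAILED)
                        || (state == State.LAUNCHED && event == Event.EXPIRE)) {
                    // The attempt starts finishing without blacklisting anything.
                    state = State.FINAL_SAVING;
                }
                break;
            case FINAL_SAVING:
                // Other events in this state are not modeled in this sketch.
                if (event == Event.ATTEMPT_UPDATE_SAVED) {
                    state = State.FINAL;
                }
                break;
            case FINAL:
                // Every event, including a late CONTAINER_FINISHED, is ignored.
                break;
        }
    }

    boolean isBlacklisted(String node) {
        return blacklistedNodes.contains(node);
    }

    State getState() {
        return state;
    }

    public static void main(String[] args) {
        // Scenario 1: LAUNCH_FAILED arrives first, CONTAINER_FINISHED arrives late.
        AttemptBlacklistSketch late =
                new AttemptBlacklistSketch(State.ALLOCATED, "nm-1");
        late.handle(Event.LAUNCH_FAILED);
        late.handle(Event.ATTEMPT_UPDATE_SAVED);
        late.handle(Event.CONTAINER_FINISHED);
        System.out.println("late CONTAINER_FINISHED, blacklisted: "
                + late.isBlacklisted("nm-1"));

        // Contrast: CONTAINER_FINISHED arrives while the attempt is still LAUNCHED.
        AttemptBlacklistSketch early =
                new AttemptBlacklistSketch(State.LAUNCHED, "nm-1");
        early.handle(Event.CONTAINER_FINISHED);
        System.out.println("early CONTAINER_FINISHED, blacklisted: "
                + early.isBlacklisted("nm-1"));
    }
}
```

In this model the late CONTAINER_FINISHED is dropped because the attempt has already reached FINAL, which is the behaviour reported in both scenarios. Whether the actual fix should handle the event in a later state or blacklist on LAUNCH_FAILED and EXPIRE directly is left open here.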
> Application is failing when AM node is stopped
> ----------------------------------------------
>
> Key: YARN-8065
> URL: https://issues.apache.org/jira/browse/YARN-8065
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bilwa S T
> Assignee: Bilwa S T
> Priority: Major
> Fix For: 3.0.0
>
>
> Configure yarn.scheduler.capacity.schedule-asynchronously.enable as *true*
> and
> yarn.resourcemanager.nodemanagers.heartbeat-interval-ms as *60000*. Run an
> application and bring the *AM node* down. The application will fail.
> If the same node is picked again to launch the next AM attempt, the
> application can fail. This is more likely to occur with fewer nodes.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)