[ 
https://issues.apache.org/jira/browse/YARN-9608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16862789#comment-16862789
 ] 

Zhankun Tang commented on YARN-9608:
------------------------------------

[~abmodi], Thanks. Just read through the whole patch. Two questions:

1. Suppose there is a long-running Spark shell application A in YARN cluster 
mode whose containers previously ran on node 1, while A's AM runs on node 2. 
If node 1 is now decommissioning, only the timeout can cause it to shut down, 
right?

2. If node 1 is shut down due to the timeout and re-registers in the future, 
will it still be considered as belonging to the running application A?

> DecommissioningNodesWatcher should get lists of running applications on node 
> from RMNode.
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-9608
>                 URL: https://issues.apache.org/jira/browse/YARN-9608
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Abhishek Modi
>            Assignee: Abhishek Modi
>            Priority: Major
>         Attachments: YARN-9608.001.patch, YARN-9608.002.patch
>
>
> At present, DecommissioningNodesWatcher tracks the list of running 
> applications and triggers decommissioning of a node once all the applications 
> that ran on the node have completed. This Jira proposes to solve the 
> following problems:
>  # DecommissioningNodesWatcher does not track application containers on a 
> node before the node enters the DECOMMISSIONING state; it only tracks 
> containers once the node is in that state. This can lead to shuffle data 
> loss for apps whose containers ran on the node before it was moved to the 
> DECOMMISSIONING state.
>  # It keeps its own list of running apps. This list can be obtained directly 
> from RMNode.
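
The difference between the two tracking strategies in the description can be illustrated with a minimal toy model. This is only a sketch: the classes `Node` and `ContainerTrackingWatcher` and the method `readyByNodeApps` are hypothetical stand-ins invented for illustration, not the actual Hadoop classes or the code in the patch.

```java
import java.util.HashSet;
import java.util.Set;

public class DecommissionSketch {

    /** Toy stand-in for a node (hypothetical; not the real RMNode interface). */
    static class Node {
        boolean decommissioning = false;
        // Apps that have run containers on this node and have not yet finished.
        final Set<String> runningApps = new HashSet<>();
    }

    /** Hypothetical watcher that only records apps seen after decommissioning starts. */
    static class ContainerTrackingWatcher {
        final Set<String> trackedApps = new HashSet<>();

        void onContainerLaunched(Node node, String appId) {
            // Containers launched before the DECOMMISSIONING state are ignored.
            if (node.decommissioning) {
                trackedApps.add(appId);
            }
        }

        boolean readyToDecommission() {
            return trackedApps.isEmpty();
        }
    }

    /** Alternative check that asks the node itself for its running apps. */
    static boolean readyByNodeApps(Node node) {
        return node.runningApps.isEmpty();
    }

    public static void main(String[] args) {
        Node node1 = new Node();
        ContainerTrackingWatcher watcher = new ContainerTrackingWatcher();

        // App A runs a container on node 1 before decommissioning starts.
        node1.runningApps.add("appA");
        watcher.onContainerLaunched(node1, "appA");

        // Node 1 then moves to the decommissioning state.
        node1.decommissioning = true;

        // The container-tracking watcher never saw app A.
        System.out.println(watcher.readyToDecommission()); // prints "true"
        // The node-based check still sees app A as running.
        System.out.println(readyByNodeApps(node1));        // prints "false"
    }
}
```

In this toy model, the first check would let node 1 be decommissioned while app A may still need shuffle data stored there, whereas the node-based check holds the node until app A finishes (or, in the real system, until the timeout expires).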



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

