[ 
https://issues.apache.org/jira/browse/YARN-3194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330150#comment-14330150
 ] 

Hudson commented on YARN-3194:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #111 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/111/])
YARN-3194. RM should handle NMContainerStatuses sent by NM while registering if 
NM is Reconnected node. Contributed by Rohith (jlowe: rev 
a64dd3d24bfcb9af21eb63869924f6482b147fd3)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeReconnectEvent.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationCleanup.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMNodeTransitions.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceTrackerService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/rmnode/RMNodeImpl.java
* hadoop-yarn-project/CHANGES.txt


> RM should handle NMContainerStatuses sent by NM while registering if NM is 
> Reconnected node
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-3194
>                 URL: https://issues.apache.org/jira/browse/YARN-3194
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>         Environment: NM restart is enabled
>            Reporter: Rohith
>            Assignee: Rohith
>            Priority: Blocker
>             Fix For: 2.7.0
>
>         Attachments: 0001-YARN-3194.patch, 0001-yarn-3194-v1.patch
>
>
> On NM restart, the NM sends all outstanding NMContainerStatus entries to the 
> RM during registration. The RM can treat the registration as coming from 
> either a new node or a reconnecting node, and it triggers the corresponding 
> event (node added or node reconnected). 
> # Node added event: two scenarios can occur 
> ## New node is registering with a different ip:port – NOT A PROBLEM
> ## Old node is re-registering because of a RESYNC command from the RM after 
> an RM restart – NOT A PROBLEM
> # Node reconnected event: 
> ## Existing node is re-registering, i.e. the RM treats it as a reconnecting 
> node when the RM has not been restarted 
> ### NM RESTART is NOT enabled – NOT A PROBLEM
> ### NM RESTART is enabled 
> #### Some applications are running on this node – *Problem is here*
> #### Zero applications are running on this node – NOT A PROBLEM
> Since the NMContainerStatuses are not handled, the RM never learns about 
> completed containers and never releases the resources held by those 
> containers. The RM will not allocate new containers for pending resource 
> requests until the completedContainer event is triggered. As a result, 
> applications wait indefinitely because their pending container requests are 
> never served by the RM.
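
The problem case described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the committed patch: ReconnectSketch, ContainerReport, allocate, usedMb and onNodeReconnected are hypothetical names invented for illustration and are not YARN classes or methods. It shows the idea that a node-reconnected handler should consume the container statuses sent at registration, so that resources held by completed containers are released.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of the RM's bookkeeping for a single node.
// None of these types are the real YARN classes.
final class ReconnectSketch {

    enum ContainerState { RUNNING, COMPLETE }

    // Stand-in for one container status sent by the NM at registration.
    static final class ContainerReport {
        final String containerId;
        final ContainerState state;

        ContainerReport(String containerId, ContainerState state) {
            this.containerId = containerId;
            this.state = state;
        }
    }

    // Memory (MB) the toy RM believes is in use, keyed by container id.
    private final Map<String, Integer> allocatedMb = new HashMap<>();

    void allocate(String containerId, int memoryMb) {
        allocatedMb.put(containerId, memoryMb);
    }

    int usedMb() {
        int total = 0;
        for (int mb : allocatedMb.values()) {
            total += mb;
        }
        return total;
    }

    // Assumed handler for a known node registering again while the RM has
    // not restarted. Completed containers are released; if the reports were
    // ignored here, their memory would stay accounted as used forever.
    List<String> onNodeReconnected(List<ContainerReport> reports) {
        List<String> released = new ArrayList<>();
        for (ContainerReport report : reports) {
            if (report.state == ContainerState.COMPLETE
                    && allocatedMb.remove(report.containerId) != null) {
                released.add(report.containerId);
            }
        }
        return released;
    }

    public static void main(String[] args) {
        ReconnectSketch rm = new ReconnectSketch();
        rm.allocate("container_1", 1024);
        rm.allocate("container_2", 2048);

        List<ContainerReport> reports = new ArrayList<>();
        reports.add(new ContainerReport("container_1", ContainerState.COMPLETE));
        reports.add(new ContainerReport("container_2", ContainerState.RUNNING));

        System.out.println("released: " + rm.onNodeReconnected(reports));
        System.out.println("still used (MB): " + rm.usedMb());
    }
}
```

In this model, skipping the loop in onNodeReconnected corresponds to the reported bug: the memory for container_1 would remain counted as used, and pending requests could not be served from it.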



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
