[ 
https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588818#comment-13588818
 ] 

Bikas Saha commented on YARN-365:
---------------------------------

Do we need to worry about overlap between the two lists, i.e. a 
newlyLaunchedContainer that has also completed by the time the slow RM handles 
the NM updates?
{code}
+  private synchronized void nodeUpdate(RMNode nm) {
     if (LOG.isDebugEnabled()) {
       LOG.debug("nodeUpdate: " + nm + " clusterResources: " + clusterResource);
     }
-                  
-    FiCaSchedulerNode node = getNode(nm.getNodeID());
 
+    FiCaSchedulerNode node = getNode(nm.getNodeID());
+    List<UpdatedContainerInfo> containerInfoList = nm.pullContainerUpdates();
+    List<ContainerStatus> newlyLaunchedContainers = new ArrayList<ContainerStatus>();
+    List<ContainerStatus> completedContainers = new ArrayList<ContainerStatus>();
+    for(UpdatedContainerInfo containerInfo : containerInfoList) {
+      newlyLaunchedContainers.addAll(containerInfo.getNewlyLaunchedContainers());
+      completedContainers.addAll(containerInfo.getCompletedContainers());
+    }
+    
{code}
Note that this problem (if it is a problem) exists regardless of this change 
because a container may start and complete within a single NM heartbeat 
interval. However, the chances of hitting it are low before this change because 
the heartbeat interval is short, so the RM rarely sees a node update in which 
the same container both launches and completes. After this change, with a slow 
RM, this can easily happen, especially because we are simply concatenating the 
sub-lists from each pending update.
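
To make the concern concrete, here is a minimal, self-contained sketch of how the overlap could be detected after concatenating the per-heartbeat sub-lists. It is not part of the patch: HeartbeatUpdate and findOverlap are hypothetical names, and plain String IDs stand in for the real ContainerStatus objects.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ContainerUpdateOverlap {

    /** Hypothetical stand-in for one queued NM heartbeat (not a YARN class). */
    static final class HeartbeatUpdate {
        final List<String> launched;   // container IDs reported as newly launched
        final List<String> completed;  // container IDs reported as completed

        HeartbeatUpdate(List<String> launched, List<String> completed) {
            this.launched = launched;
            this.completed = completed;
        }
    }

    /**
     * Concatenates the sub-lists of all pending updates, as the patch does,
     * and returns the container IDs that end up in both resulting lists.
     */
    static Set<String> findOverlap(List<HeartbeatUpdate> updates) {
        Set<String> launched = new LinkedHashSet<>();
        Set<String> completed = new LinkedHashSet<>();
        for (HeartbeatUpdate update : updates) {
            launched.addAll(update.launched);
            completed.addAll(update.completed);
        }
        // Keep only the IDs that were both launched and completed.
        launched.retainAll(completed);
        return launched;
    }

    public static void main(String[] args) {
        // Two heartbeats queue up while the RM is slow: c1 launches in the
        // first and completes in the second.
        List<HeartbeatUpdate> updates = Arrays.asList(
            new HeartbeatUpdate(Arrays.asList("c1", "c2"), Arrays.<String>asList()),
            new HeartbeatUpdate(Arrays.asList("c3"), Arrays.asList("c1")));
        System.out.println(findOverlap(updates)); // prints [c1]
    }
}
```

Whether the scheduler should skip such containers, or process the launch before the completion, is a design decision this sketch does not settle.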
                
> Each NM heartbeat should not generate an event for the Scheduler
> ----------------------------------------------------------------
>
>                 Key: YARN-365
>                 URL: https://issues.apache.org/jira/browse/YARN-365
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>    Affects Versions: 0.23.5
>            Reporter: Siddharth Seth
>            Assignee: Xuan Gong
>             Fix For: 2.0.4-beta
>
>         Attachments: Prototype2.txt, Prototype3.txt, YARN-365.10.patch, 
> YARN-365.1.patch, YARN-365.2.patch, YARN-365.3.patch, YARN-365.4.patch, 
> YARN-365.5.patch, YARN-365.6.patch, YARN-365.7.patch, YARN-365.8.patch, 
> YARN-365.9.patch
>
>
> Follow up from YARN-275
> https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt

