[ 
https://issues.apache.org/jira/browse/YARN-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061387#comment-14061387
 ] 

Karthik Kambatla commented on YARN-2244:
----------------------------------------

# Can we use {{AbstractYarnScheduler#killOrphanContainerOnNode()}} instead? 
{code}
+      this.rmContext.getDispatcher().getEventHandler()
+          .handle(new RMNodeCleanContainerEvent(node.getNodeID(), containerId));
{code}
# Thanks for moving the following to a separate method. IMO, we should clean it 
up more:
{code}
  protected void waitForContainerCleanup(DrainDispatcher dispatcher, MockNM nm,
                                         NodeHeartbeatResponse resp) throws Exception {
    int waitCount;
    dispatcher.await();
    List<ContainerId> contsToClean = resp.getContainersToCleanup();
    int cleanedConts = contsToClean.size();
    waitCount = 0;
    while (cleanedConts < 1 && waitCount++ < 200) {
      LOG.info("Waiting to get cleanup events.. cleanedConts: " + cleanedConts);
      Thread.sleep(100);
      resp = nm.nodeHeartbeat(true);
      dispatcher.await();
      contsToClean = resp.getContainersToCleanup();
      cleanedConts += contsToClean.size();
    }
    if (contsToClean.isEmpty()) {
      LOG.error("Failed to get any containers to cleanup");
    } else {
      LOG.info("Got cleanup for " + contsToClean.get(0));
    }
    Assert.assertEquals(1, cleanedConts);
  }
{code}
## One line over 80 chars
## {{int waitCount = 0}} can go on one line
## Fetching containers to clean and other arithmetic before the while loop can 
be moved into the while loop. cleanedConts can be initialized to zero. I am 
okay with a do-while too. 
## Remove the logging. I am not sure why we are logging that information up to 
200 times.
## Parametrize the method to also take the number of container cleanups to wait 
for, and use it everywhere. 
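
For illustration only, here is a rough sketch of how the helper might look with the sub-points above applied. To keep it compilable on its own, the YARN test types are replaced by hypothetical stand-ins: a {{Supplier<List<String>>}} plays the role of {{nm.nodeHeartbeat(true)}} followed by {{getContainersToCleanup()}}, and plain strings stand in for {{ContainerId}}. The class name and the parameters {{expectedCleanups}}, {{maxWaits}} and {{sleepMillis}} are assumptions, not part of the patch, and the {{dispatcher.await()}} calls are left out.

```java
import java.util.List;
import java.util.function.Supplier;

// Sketch only: stand-in types replace MockNM / NodeHeartbeatResponse / ContainerId.
final class ContainerCleanupWait {

  private ContainerCleanupWait() {
  }

  /**
   * Polls heartbeats until at least expectedCleanups containers have been
   * reported for cleanup, or maxWaits further heartbeats have been sent.
   *
   * @param heartbeat hypothetical stand-in for nm.nodeHeartbeat(true) plus
   *                  getContainersToCleanup()
   * @param firstCleanups containers reported by the response the caller
   *                      already holds
   * @return the total number of containers reported for cleanup
   */
  static int waitForContainerCleanup(Supplier<List<String>> heartbeat,
      List<String> firstCleanups, int expectedCleanups, int maxWaits,
      long sleepMillis) {
    int cleanedConts = 0;
    int waitCount = 0;
    List<String> contsToClean = firstCleanups;
    while (true) {
      // All counting happens inside the loop; nothing is duplicated before it.
      cleanedConts += contsToClean.size();
      if (cleanedConts >= expectedCleanups || waitCount++ >= maxWaits) {
        return cleanedConts;
      }
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt and stop waiting.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting", e);
      }
      contsToClean = heartbeat.get();
    }
  }
}
```

A caller in the test would then assert on the returned count, along the lines of {{Assert.assertEquals(1, waitForContainerCleanup(..., 1, 200, 100))}}, instead of hard-coding the expected number inside the helper.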

> FairScheduler missing handling of containers for unknown application attempts 
> ------------------------------------------------------------------------------
>
>                 Key: YARN-2244
>                 URL: https://issues.apache.org/jira/browse/YARN-2244
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>            Priority: Critical
>         Attachments: YARN-2224.patch, YARN-2244.001.patch, YARN-2244.002.patch
>
>
> FairScheduler is missing the changes from the MAPREDUCE-3596 patch. Among 
> other fixes that were common across schedulers, that patch added some 
> scheduler-specific fixes to handle containers for unknown application 
> attempts. Without these, FairScheduler simply logs that an unknown container 
> was found and continues to let it run. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
