[ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jun Gong resolved YARN-2612.
----------------------------
Resolution: Duplicate
> Some completed containers are not reported to NM
> ------------------------------------------------
>
> Key: YARN-2612
> URL: https://issues.apache.org/jira/browse/YARN-2612
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.6.0
> Reporter: Jun Gong
> Fix For: 2.6.0
>
>
> We are testing RM work-preserving restart and found the following logs when
> we ran a simple MapReduce job, "PI". Some completed containers that had
> already been pulled by the AM were never acknowledged back to the NM, so the
> NM kept reporting those completed containers even after the AM had finished.
> {code}
> 2014-09-26 17:00:42,228 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:42,228 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:43,230 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:43,230 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:44,233 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> 2014-09-26 17:00:44,233 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> {code}
> Since YARN-1372, the NM keeps reporting completed containers to the RM until
> it gets an ACK from the RM. If the AM does not call allocate, the AM does not
> ack the RM, and the RM in turn does not ack the NM. We ([~chenchun]) have
> observed these two cases when running the MapReduce job 'pi':
> 1) The RM sends completed containers to the AM. After receiving them, the AM
> considers its work done and needs no more resources, so it does not call
> allocate again.
> 2) When the AM finishes, it cannot ack the RM because the AM itself has not
> finished yet.
> We think that when RMAppAttempt calls BaseFinalTransition, the AppAttempt
> has finished, so the RM could then send this AppAttempt's completed
> containers to the NM.
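The ack chain described above (NM -> RM -> AM, with the ack flowing back only on the AM's next allocate call) can be sketched as a toy simulation. This is only an illustrative model under stated assumptions, not YARN code: the class names ToyRM and ToyNM, their methods, and the flush_on_attempt_finish switch are all hypothetical, and the switch merely models the suggestion that the RM release an attempt's completed containers to the NM once the attempt reaches its final transition.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRM:
    """Hypothetical stand-in for the RM; not the real ResourceManager API."""
    # Completed containers reported by the NM, not yet pulled by the AM.
    just_finished: set = field(default_factory=set)
    # Containers returned to the AM by its last allocate call; only the
    # AM's *next* allocate call acknowledges them.
    pulled_by_am: set = field(default_factory=set)
    # Containers the RM is ready to ack to the NM on the next heartbeat.
    ack_to_nm: set = field(default_factory=set)
    # Models the proposal: release everything when the attempt finishes.
    flush_on_attempt_finish: bool = False

    def node_heartbeat(self, reported):
        """NM reports un-acked completed containers; RM returns its acks."""
        self.just_finished |= set(reported) - self.pulled_by_am - self.ack_to_nm
        acks, self.ack_to_nm = self.ack_to_nm, set()
        return acks

    def allocate(self):
        """AM call: acks the previous batch and pulls the new one."""
        self.ack_to_nm |= self.pulled_by_am
        self.pulled_by_am, self.just_finished = self.just_finished, set()
        return set(self.pulled_by_am)

    def finish_attempt(self):
        """The attempt reaches its final transition."""
        if self.flush_on_attempt_finish:
            self.ack_to_nm |= self.pulled_by_am | self.just_finished
            self.pulled_by_am, self.just_finished = set(), set()


@dataclass
class ToyNM:
    """Hypothetical stand-in for the NM."""
    pending: set = field(default_factory=set)

    def heartbeat(self, rm):
        # Keep reporting every completed container until the RM acks it.
        self.pending -= rm.node_heartbeat(self.pending)


def run(flush_on_attempt_finish):
    rm = ToyRM(flush_on_attempt_finish=flush_on_attempt_finish)
    nm = ToyNM(pending={"container_1", "container_2"})
    nm.heartbeat(rm)     # NM reports two completed containers
    rm.allocate()        # AM pulls them and never calls allocate again
    rm.finish_attempt()  # attempt finishes without a further ack from the AM
    for _ in range(3):   # NM keeps heartbeating after the AM is gone
        nm.heartbeat(rm)
    return nm.pending


if __name__ == "__main__":
    print("without flush:", sorted(run(False)))
    print("with flush:   ", sorted(run(True)))
```

In this model, the run without the flush leaves both containers in the NM's pending set indefinitely, which corresponds to the repeated reports in the log above. The run with the flush drains the set on the first heartbeat after the attempt finishes.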