[ 
https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Gong updated YARN-2612:
---------------------------
    Attachment: YARN-2612.2.patch

Also changes the Capacity and FIFO schedulers.

> Some completed containers are not reported to NM
> ------------------------------------------------
>
>                 Key: YARN-2612
>                 URL: https://issues.apache.org/jira/browse/YARN-2612
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Jun Gong
>             Fix For: 2.6.0
>
>         Attachments: YARN-2612.2.patch, YARN-2612.patch
>
>
> In YARN-1372, the NM keeps reporting completed containers to the RM until it
> gets an ACK from the RM. If the AM does not call allocate, the AM does not ack
> the RM, so the RM will not ack the NM. We ([~chenchun]) have observed these two
> cases when running the MapReduce 'pi' job:
> 1) The RM sends completed containers to the AM. After receiving them, the AM
> considers its work done and needs no more resources, so it does not call
> allocate.
> 2) When the AM finishes, it cannot ack the RM because the AM itself has not
> finished yet.
> To solve this problem, we see two possible solutions:
> 1) When RMAppAttempt calls FinalTransition, the AppAttempt has finished, so
> the RM can send this AppAttempt's completed containers to the NM.
> 2) In FairScheduler#nodeUpdate, if a completed container reported by the NM
> has no corresponding RMContainer, the RM just acks it to the NM.
> We prefer solution 2 because it is clearer and more concise. However, the RM
> might ack the same completed container to the NM many times.
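
A rough sketch of the idea behind solution 2, for illustration only. It is not
the attached patch and does not use the real FairScheduler or RMContainer
classes: SketchScheduler, track, attemptFinished and the String container IDs
are all hypothetical stand-ins. It only shows the rule "no tracked container on
the RM side means ack to the NM right away", and that the same container may be
acked on every heartbeat until the NM stops reporting it.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified stand-in for the RM scheduler. */
class SketchScheduler {
    // containerId -> app attempt, for containers the RM still tracks
    // (plays the role of the RMContainer lookup in this sketch).
    private final Map<String, String> tracked = new HashMap<>();
    // Completed containers still waiting for the AM to call allocate.
    private final Set<String> waitingForAm = new LinkedHashSet<>();

    void track(String containerId, String attemptId) {
        tracked.put(containerId, attemptId);
    }

    /** The app attempt is done, so the RM forgets all of its containers. */
    void attemptFinished(String attemptId) {
        tracked.entrySet().removeIf(e -> {
            boolean match = e.getValue().equals(attemptId);
            if (match) {
                waitingForAm.remove(e.getKey());
            }
            return match;
        });
    }

    /**
     * Handles the completed containers from one NM heartbeat and returns
     * the IDs the RM acks immediately.
     */
    Set<String> nodeUpdate(List<String> completedFromNm) {
        Set<String> ackNow = new LinkedHashSet<>();
        for (String id : completedFromNm) {
            if (tracked.containsKey(id)) {
                // Still tracked: hold the ack until the AM has seen it.
                waitingForAm.add(id);
            } else {
                // No corresponding container on the RM side: ack directly.
                // Repeated reports are acked again, which is harmless here.
                ackNow.add(id);
            }
        }
        return ackNow;
    }
}
```

In this model a container whose attempt is still running is not acked, while
one whose attempt has already finished is acked on every heartbeat that
reports it.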



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
