hex108 updated YARN-2612:
    Attachment: YARN-2612.patch

> Some completed containers are not reported to NM
> ------------------------------------------------
>                 Key: YARN-2612
>                 URL: https://issues.apache.org/jira/browse/YARN-2612
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: hex108
>             Fix For: 2.6.0
>         Attachments: YARN-2612.patch
> In YARN-1372, NM keeps reporting completed containers to RM until it gets an
> ACK from RM. If AM does not call allocate, AM does not ack RM, and so RM
> will not ack NM. We have observed these two cases when running the MapReduce
> task 'pi':
> 1) RM sends completed containers to AM. After receiving them, AM considers
> its work done and needs no more resources, so it does not call allocate.
> 2) When AM finishes, it cannot ack RM because AM itself has not finished
> yet.
> To solve this problem, we have two candidate solutions:
> 1) When RMAppAttempt calls FinalTransition, the AppAttempt has finished, so
> RM can send this AppAttempt's completed containers to NM.
> 2) In FairScheduler#nodeUpdate, if a completed container sent by NM has no
> corresponding RMContainer, RM just acks it to NM.
> We prefer solution 2 because it is clearer and more concise. However, RM
> might ack the same completed containers to NM many times.
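
The idea behind solution 2 can be illustrated with a small, self-contained sketch. This is not the attached patch and it does not use the real YARN classes: SketchNodeManager, SketchScheduler, liveContainers, waitingForAm and the method names are all hypothetical stand-ins, and the model assumes the RM drops its container record the first time the completion is reported.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; none of these classes are real YARN types.
class AckSketchDemo {
    public static void main(String[] args) {
        SketchNodeManager nm = new SketchNodeManager();
        SketchScheduler rm = new SketchScheduler();
        rm.liveContainers.put("container_1", "attempt_1");
        nm.pendingCompleted.add("container_1");

        // First heartbeat: RM still tracks the container, so it waits for the AM.
        nm.onAck(rm.nodeUpdate(nm.heartbeatReport()));
        System.out.println("pending after 1st heartbeat: " + nm.pendingCompleted);

        // The AM never calls allocate. On the re-report the RM finds no record
        // for the container and acks it directly.
        nm.onAck(rm.nodeUpdate(nm.heartbeatReport()));
        System.out.println("pending after 2nd heartbeat: " + nm.pendingCompleted);
    }
}

class SketchNodeManager {
    // Completed containers the NM keeps re-reporting until the RM acks them.
    final Set<String> pendingCompleted = new LinkedHashSet<>();

    List<String> heartbeatReport() {
        return new ArrayList<>(pendingCompleted);
    }

    // Removing an id that is already gone is a no-op, so repeated acks are harmless.
    void onAck(List<String> ackedIds) {
        pendingCompleted.removeAll(ackedIds);
    }
}

class SketchScheduler {
    // Container id -> owning app attempt; stands in for the RMContainer lookup.
    final Map<String, String> liveContainers = new HashMap<>();
    // Completed containers that would be handed to the AM on its next allocate call.
    final Map<String, List<String>> waitingForAm = new HashMap<>();

    // Stand-in for FairScheduler#nodeUpdate: returns the ids to ack to the NM now.
    List<String> nodeUpdate(List<String> completedFromNm) {
        List<String> ackNow = new ArrayList<>();
        for (String id : completedFromNm) {
            String attempt = liveContainers.remove(id);
            if (attempt == null) {
                // No corresponding record: ack directly instead of waiting for the AM.
                ackNow.add(id);
            } else {
                waitingForAm.computeIfAbsent(attempt, k -> new ArrayList<>()).add(id);
            }
        }
        return ackNow;
    }
}
```

In this model the NM stops re-reporting after the second heartbeat even though the AM never acked. The drawback noted above also shows up: nothing prevents the scheduler from acking the same id again if the NM reports it once more, which is why onAck is written to tolerate duplicates.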

This message was sent by Atlassian JIRA
