zhihai xu commented on YARN-433:

Thanks for the patch [~xgong]! The patch looks good to me except two nits:
# It may be better to use {{containerId}} instead of 
{{remoteContainer.getContainerId()}} as the parameter of 
{{containerAllocationExpirer.unregister}} since they are equivalent.
# Since we do {{containerAllocationExpirer.unregister}} for the completed 
containers in {{RMNodeImpl#handleContainerStatus}} earlier, Maybe we can remove 
the redundant code {{containerAllocationExpirer.unregister}}  at 
{{RMContainerImpl#ContainerFinishedAtAcquiredState}}. After we remove this code 
at {{RMContainerImpl#ContainerFinishedAtAcquiredState}},  
ContainerFinishedAtAcquiredState will be the same as FinishedTransition, So we 
can replace ContainerFinishedAtAcquiredState with FinishedTransition and remove 
    .addTransition(RMContainerState.ACQUIRED, RMContainerState.COMPLETED,
        RMContainerEventType.FINISHED, new FinishedTransition())

> When RM is catching up with node updates then it should not expire acquired 
> containers
> --------------------------------------------------------------------------------------
>                 Key: YARN-433
>                 URL: https://issues.apache.org/jira/browse/YARN-433
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>         Attachments: YARN-433.1.patch, YARN-433.2.patch, YARN-433.3.patch
> RM expires containers that are not launched within some time of being 
> allocated. The default is 10mins. When an RM is not keeping up with node 
> updates then it may not be aware of new launched containers. If the expire 
> thread fires for such containers then the RM can expire them even though they 
> may have launched.

This message was sent by Atlassian JIRA

Reply via email to