[ https://issues.apache.org/jira/browse/YARN-2630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155227#comment-14155227 ]
Zhijie Shen commented on YARN-2630:
-----------------------------------
Would you please check whether "finishedContainersPulledByAM" has been completely
replaced in the code base? The old name still appears in the following hunks:
{code}
-    if (this.finishedContainersPulledByAM != null) {
+    if (this.containersToBeRemovedFromNM != null) {
       addFinishedContainersPulledByAMToProto();
     }
{code}
{code}
-  public void addFinishedContainersPulledByAM(
+  public void addContainersToBeRemovedFromNM(
       final List<ContainerId> finishedContainersPulledByAM) {
     if (finishedContainersPulledByAM == null)
       return;
     initFinishedContainersPulledByAM();
-    this.finishedContainersPulledByAM.addAll(finishedContainersPulledByAM);
+    this.containersToBeRemovedFromNM.addAll(finishedContainersPulledByAM);
{code}
{code}
- nhResponse.addFinishedContainersPulledByAM(finishedContainersPulledByAM);
+ nhResponse.addContainersToBeRemovedFromNM(finishedContainersPulledByAM);
{code}
{code}
-    response.addFinishedContainersPulledByAM(
+    response.addContainersToBeRemovedFromNM(
         new ArrayList<ContainerId>(this.finishedContainersPulledByAM));
{code}
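For illustration only, a consistent rename might look roughly like the sketch below. The class name, the helper initContainersToBeRemovedFromNM, the getter, and the parameter name "containers" are assumptions on my part, not taken from the patch, and String stands in for ContainerId so that the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the response class touched by the patch.
// String replaces ContainerId purely to keep the sketch self-contained.
class HeartbeatResponseSketch {
    private List<String> containersToBeRemovedFromNM = null;

    // Assumed helper name: the lazy initializer follows the renamed field.
    private void initContainersToBeRemovedFromNM() {
        if (this.containersToBeRemovedFromNM == null) {
            this.containersToBeRemovedFromNM = new ArrayList<String>();
        }
    }

    // The method name, the parameter, and the field all use the new name.
    public void addContainersToBeRemovedFromNM(final List<String> containers) {
        if (containers == null) {
            return;
        }
        initContainersToBeRemovedFromNM();
        this.containersToBeRemovedFromNM.addAll(containers);
    }

    // Returns a defensive copy so callers cannot mutate internal state.
    public List<String> getContainersToBeRemovedFromNM() {
        initContainersToBeRemovedFromNM();
        return new ArrayList<String>(this.containersToBeRemovedFromNM);
    }
}
```

The point is only that the field, its initializer, the parameter, and the callers all switch to the new name together, so no reference to the old name survives.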
> TestDistributedShell#testDSRestartWithPreviousRunningContainers fails
> ---------------------------------------------------------------------
>
> Key: YARN-2630
> URL: https://issues.apache.org/jira/browse/YARN-2630
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-2630.1.patch, YARN-2630.2.patch, YARN-2630.3.patch
>
>
> The problem is that after YARN-1372, in a work-preserving AM restart, the
> re-launched AM also receives the previously failed AM container. The
> DistributedShell logic does not expect this extra completed container.