[
https://issues.apache.org/jira/browse/YARN-1933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967318#comment-13967318
]
Jian He commented on YARN-1933:
-------------------------------
- TestAMRestart:
Removed the following check: after we send the container-complete event, the
container may be removed immediately from liveContainers inside the
schedulerAttempt, which makes the check throw an NPE.
{code}
nm1.nodeHeartbeat(am1.getApplicationAttemptId(), 3,
ContainerState.COMPLETE);
- rm1.waitForState(nm1, containerId3, RMContainerState.COMPLETED);
{code}
Also changed some test logic to wait until the expected number of containers
is reached.
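
As a rough illustration of the "wait until the expected number of containers
is reached" idea, here is a minimal polling sketch in plain Java. It is not
the code from the patch: the class ContainerCountWaiter, the method
waitForCount and all of its parameters are hypothetical names, and the counter
in main() only stands in for whatever the test actually queries.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

// Hypothetical helper, not part of the YARN test code.
class ContainerCountWaiter {

    /**
     * Polls currentCount until it reports at least `expected`, or until
     * timeoutMs has elapsed. Returns true if the count was reached in time.
     */
    static boolean waitForCount(IntSupplier currentCount, int expected,
                                long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (currentCount.getAsInt() < expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the count was reached
            }
            Thread.sleep(pollMs);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for containers that show up asynchronously.
        AtomicInteger allocated = new AtomicInteger(0);
        Thread allocator = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    return;
                }
                allocated.incrementAndGet();
            }
        });
        allocator.start();

        boolean reached = waitForCount(allocated::get, 3, 5000, 10);
        allocator.join();
        System.out.println("reached expected count: " + reached);
    }
}
```

Polling on a count, rather than asserting on one specific container, avoids
depending on an object that may already have been removed by the time the
test looks for it.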
- TestNodeHealthService:
Grant read and write permission on the script file, and put the close() call
in a finally block.
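
The shape of that change might look roughly like the sketch below. It is an
assumption-laden illustration, not the patched test: ScriptFileWriter and
writeScript are hypothetical names, and only standard java.io calls are used.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Hypothetical helper, not part of TestNodeHealthService.
class ScriptFileWriter {

    /** Writes the script body, then grants read and write permission. */
    static void writeScript(File scriptFile, String body) throws IOException {
        PrintWriter pw = null;
        try {
            pw = new PrintWriter(new FileWriter(scriptFile));
            pw.println(body);
            pw.flush();
        } finally {
            // Close even if the write fails; on Windows an open handle can
            // block later attempts to rewrite or delete the file.
            if (pw != null) {
                pw.close();
            }
        }
        scriptFile.setReadable(true);
        scriptFile.setWritable(true);
    }

    public static void main(String[] args) throws IOException {
        File script = File.createTempFile("healthScript", ".sh");
        script.deleteOnExit();
        writeScript(script, "echo ok");
        System.out.println("readable=" + script.canRead()
                + " writable=" + script.canWrite());
    }
}
```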
- Minor side fix in ZKRMStateStore.java: moved the error message to debug
level, because I found that the createRootDir method throws
NodeAlreadyExistsException if the root already exists, and the root always
exists after the RM restarts.
> TestAMRestart and TestNodeHealthService failing sometimes on Windows
> --------------------------------------------------------------------
>
> Key: YARN-1933
> URL: https://issues.apache.org/jira/browse/YARN-1933
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Jian He
> Assignee: Jian He
> Attachments: YARN-1933.1.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.2#6252)