[
https://issues.apache.org/jira/browse/YARN-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802258#comment-13802258
]
Andrey Klochkov commented on YARN-1183:
---------------------------------------
Jonathan, the issue occurred when I simply ran the tests for
hadoop-mapreduce-client-jobclient and watched for zombie Java processes. It was
much more visible with parallel execution, see MAPREDUCE-4980. I observed
it quite often under OSX (some of the tests did it on every run) and didn't
see it on a Linux machine I had, and I had different JVMs there. I reproduced
it later in an unmodified trunk and tracked it down to MiniYARNCluster
shutdown. I can't reproduce it on another MacBook I have now, but I think this
is just due to the nature of the bug (a concurrency issue).
> MiniYARNCluster shutdown takes several minutes intermittently
> -------------------------------------------------------------
>
> Key: YARN-1183
> URL: https://issues.apache.org/jira/browse/YARN-1183
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Andrey Klochkov
> Assignee: Andrey Klochkov
> Attachments: YARN-1183--n2.patch, YARN-1183--n3.patch,
> YARN-1183--n4.patch, YARN-1183--n5.patch, YARN-1183.patch
>
>
> As described in MAPREDUCE-5501, M/R tests sometimes leave MRAppMaster Java
> processes alive for several minutes after successful completion of the
> corresponding test. A concurrency issue in the MiniYARNCluster shutdown
> logic leads to this: sometimes the RM stops before an app master sends its
> last report, and the app master then keeps retrying for more than 6 minutes.
> In some cases this leads to failures in subsequent tests, and it slows the
> tests down because the lingering app masters consume resources.
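
To illustrate the ordering problem described above, here is a minimal sketch of
a shutdown that waits (with a timeout) for every app master to deliver its last
report before the RM is stopped. It does not show the attached patches or the
real MiniYARNCluster API: the class ShutdownOrderingSketch and its methods
registerAppMaster, finalReport and stop are hypothetical names made up for this
example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a mini cluster; NOT the real MiniYARNCluster. */
final class ShutdownOrderingSketch {
    // IDs of app masters that have not yet sent their last report.
    private final Set<String> liveAppMasters = ConcurrentHashMap.newKeySet();
    private boolean rmRunning = true;

    void registerAppMaster(String id) {
        liveAppMasters.add(id);
    }

    /**
     * Called by an app master to deliver its last report.
     * Returns false if the RM is already stopped; in the bug described
     * above, that is the point where a real app master would keep retrying.
     */
    synchronized boolean finalReport(String id) {
        if (!rmRunning) {
            return false;
        }
        liveAppMasters.remove(id);
        notifyAll(); // wake up a stop() call waiting for this report
        return true;
    }

    /**
     * Waits up to timeoutMillis for all app masters to report, then stops
     * the RM. Returns true if every app master reported before the stop.
     */
    synchronized boolean stop(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!liveAppMasters.isEmpty()) {
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
            if (remainingMs <= 0) {
                break; // give up waiting, stop the RM anyway
            }
            try {
                wait(remainingMs); // releases the monitor so finalReport can run
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        rmRunning = false;
        return liveAppMasters.isEmpty();
    }
}
```

In this sketch a stop() that times out still shuts the RM down, so a test run
is never blocked indefinitely; the return value only tells the caller whether
any app master was left behind.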