[ 
https://issues.apache.org/jira/browse/YARN-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802258#comment-13802258
 ] 

Andrey Klochkov commented on YARN-1183:
---------------------------------------

Jonathan, the issue occurred when I simply ran the tests for
hadoop-mapreduce-client-jobclient and watched for zombie Java processes. It was
much more visible with parallel execution, see MAPREDUCE-4980. I observed
it quite often under OS X (some of the tests did it on every run) and didn't
see it on a Linux machine I had, where I had different JVMs. I reproduced
it later on an unmodified trunk and tracked it down to MiniYARNCluster
shutdown. I can't reproduce it on another MacBook I have now, but I think this
is just due to the nature of the bug (a concurrency issue).

> MiniYARNCluster shutdown takes several minutes intermittently
> -------------------------------------------------------------
>
>                 Key: YARN-1183
>                 URL: https://issues.apache.org/jira/browse/YARN-1183
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Andrey Klochkov
>            Assignee: Andrey Klochkov
>         Attachments: YARN-1183--n2.patch, YARN-1183--n3.patch, 
> YARN-1183--n4.patch, YARN-1183--n5.patch, YARN-1183.patch
>
>
> As described in MAPREDUCE-5501, M/R tests sometimes leave MRAppMaster Java
> processes alive for several minutes after successful completion of the
> corresponding test. There is a concurrency issue in the MiniYARNCluster shutdown
> logic which leads to this. Sometimes the RM stops before an app master sends its
> last report, and then the app master keeps retrying for >6 minutes. In some
> cases this leads to failures in subsequent tests, and it affects the performance
> of tests because the lingering app masters consume resources.
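
The race described above can be illustrated with a minimal sketch. This is
not the attached patch and does not use the real MiniYARNCluster API: the
class ShutdownOrderSketch and its methods are hypothetical stand-ins, and the
approach shown (wait a bounded time for pending final reports before stopping
the RM) is only one assumed way to avoid the ordering problem.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a mini cluster; not the real MiniYARNCluster. */
final class ShutdownOrderSketch {
    // One count per simulated app master that still owes a final report.
    private final CountDownLatch pendingFinalReports;
    private boolean resourceManagerRunning = true;

    ShutdownOrderSketch(int runningAppMasters) {
        this.pendingFinalReports = new CountDownLatch(runningAppMasters);
    }

    /**
     * Called by a simulated app master when it sends its last report.
     * Returns false if the RM is already stopped, which is the case where
     * a real app master would start retrying.
     */
    synchronized boolean sendFinalReport() {
        if (!resourceManagerRunning) {
            return false;
        }
        pendingFinalReports.countDown();
        return true;
    }

    /**
     * Waits up to the given timeout for all final reports, then stops the
     * simulated RM. Returns true if every report arrived within the timeout.
     */
    boolean stop(long timeout, TimeUnit unit) throws InterruptedException {
        boolean allReported = pendingFinalReports.await(timeout, unit);
        synchronized (this) {
            resourceManagerRunning = false;
        }
        return allReported;
    }

    synchronized boolean isResourceManagerRunning() {
        return resourceManagerRunning;
    }

    public static void main(String[] args) throws InterruptedException {
        ShutdownOrderSketch cluster = new ShutdownOrderSketch(1);
        Thread appMaster = new Thread(cluster::sendFinalReport);
        appMaster.start();
        boolean clean = cluster.stop(5, TimeUnit.SECONDS);
        appMaster.join();
        System.out.println("all final reports received: " + clean);
    }
}
```

The bounded wait keeps shutdown from hanging if an app master never reports,
while the common case lets the report land before the RM goes away.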



--
This message was sent by Atlassian JIRA
(v6.1#6144)
