[ https://issues.apache.org/jira/browse/YARN-1183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802258#comment-13802258 ]
Andrey Klochkov commented on YARN-1183:
---------------------------------------

Jonathan, the issue occurred when I simply ran the tests for hadoop-mapreduce-client-jobclient and watched for zombie Java processes. It was much more visible with parallel execution, see MAPREDUCE-4980. I observed it quite often under OS X (some of the tests did it on every run) and didn't see it on a Linux machine I had, and I had different JVMs there. I reproduced it later on an unmodified trunk and tracked it down to MiniYARNCluster shutdown. I can't reproduce it on another MacBook I have now, but I think this is just due to the nature of the bug (a concurrency issue).

> MiniYARNCluster shutdown takes several minutes intermittently
> -------------------------------------------------------------
>
>                 Key: YARN-1183
>                 URL: https://issues.apache.org/jira/browse/YARN-1183
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Andrey Klochkov
>            Assignee: Andrey Klochkov
>         Attachments: YARN-1183--n2.patch, YARN-1183--n3.patch,
>                      YARN-1183--n4.patch, YARN-1183--n5.patch, YARN-1183.patch
>
>
> As described in MAPREDUCE-5501, M/R tests sometimes leave MRAppMaster Java
> processes alive for several minutes after the corresponding test has
> completed successfully. A concurrency issue in the MiniYARNCluster shutdown
> logic leads to this: sometimes the RM stops before an app master sends its
> last report, and the app master then keeps retrying for more than 6 minutes.
> In some cases this leads to failures in subsequent tests, and it hurts test
> performance because the lingering app masters consume resources.

--
This message was sent by Atlassian JIRA
(v6.1#6144)