[ https://issues.apache.org/jira/browse/YARN-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14252216#comment-14252216 ]
Hadoop QA commented on YARN-2964:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12688092/YARN-2964.2.patch
  against trunk revision 07619aa.

    {color:green}+1 @author{color}. The patch does not contain any @author tags.

    {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.

    {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}. There were no new javadoc warning messages.

    {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

    {color:red}-1 findbugs{color}. The patch appears to introduce 14 new Findbugs (version 2.0.3) warnings.

    {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

    {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

                  org.apache.hadoop.yarn.server.resourcemanager.TestRM
                  org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestAllocationFileLoaderService
                  org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart

Test results: https://builds.apache.org/job/PreCommit-YARN-Build/6149//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-YARN-Build/6149//artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6149//console

This message is automatically generated.

> RM prematurely cancels tokens for jobs that submit jobs (oozie)
> ---------------------------------------------------------------
>
>                 Key: YARN-2964
>                 URL: https://issues.apache.org/jira/browse/YARN-2964
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Daryn Sharp
>            Assignee: Jian He
>            Priority: Blocker
>         Attachments: YARN-2964.1.patch, YARN-2964.2.patch
>
>
> The RM used to globally track the unique set of tokens for all apps. It
> remembered the first job that was submitted with the token. The first job
> controlled the cancellation of the token. This prevented completion of
> sub-jobs from canceling tokens used by the main job.
> As of YARN-2704, the RM now tracks tokens on a per-app basis. There is no
> notion of the first/main job. This results in sub-jobs canceling tokens and
> failing the main job and other sub-jobs. It also appears to schedule
> multiple redundant renewals.
> The issue is not immediately obvious because the RM cancels tokens ~10
> min (the NM liveliness interval) after log aggregation completes. As a result,
> an Oozie job (e.g. Pig) that launches many sub-jobs over time will fail if
> any sub-job is launched >10 min after any other sub-job completes. If all
> other sub-jobs complete within that 10 min window, the issue goes unnoticed.
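
The difference between the two tracking schemes described above can be illustrated with a minimal Python sketch. This is not ResourceManager code: the class names FirstOwnerTracker and PerAppTracker, the replay helper, the app ids, and the token name "T" are all hypothetical, and the sketch ignores the ~10 min cancellation delay and token renewal entirely. It only models who is allowed to cancel a shared token.

```python
class FirstOwnerTracker:
    """Hypothetical model of the old scheme: one global map,
    and the first app submitted with a token controls its cancellation."""

    def __init__(self):
        self.owner = {}          # token -> id of the first app that submitted it
        self.cancelled = set()

    def submit(self, app, tokens):
        for token in tokens:
            # setdefault keeps the first submitter as owner
            self.owner.setdefault(token, app)

    def finish(self, app):
        # Only tokens owned by the finishing app are cancelled
        for token, owner in list(self.owner.items()):
            if owner == app:
                self.cancelled.add(token)
                del self.owner[token]


class PerAppTracker:
    """Hypothetical model of the per-app scheme: every app cancels
    its own token set on completion, with no notion of a main job."""

    def __init__(self):
        self.tokens_by_app = {}  # app id -> set of tokens it was submitted with
        self.cancelled = set()

    def submit(self, app, tokens):
        self.tokens_by_app[app] = set(tokens)

    def finish(self, app):
        self.cancelled |= self.tokens_by_app.pop(app, set())


def replay(tracker):
    """Launcher and one sub-job share token T; the sub-job finishes first.
    Returns the tokens cancelled while the launcher is still running."""
    tracker.submit("launcher", ["T"])
    tracker.submit("sub1", ["T"])
    tracker.finish("sub1")
    return set(tracker.cancelled)
```

Under these assumptions, replay(FirstOwnerTracker()) leaves T alive because the launcher owns it, while replay(PerAppTracker()) cancels T as soon as the sub-job completes, which is the failure mode the issue reports for the launcher and any later sub-jobs.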