[ https://issues.apache.org/jira/browse/YARN-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-2010:
-----------------------------------
    Attachment: yarn-2010-5.patch

Updated patch to fix test failure, findbugs warning, and suppress javac warnings (we call getEventHandler().handle() at several other places, I don't quite get why it leads to a javac warning only here).

> If RM fails to recover an app, it can never transition to active again
> ----------------------------------------------------------------------
>
>                 Key: YARN-2010
>                 URL: https://issues.apache.org/jira/browse/YARN-2010
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: bc Wong
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: YARN-2010.1.patch, YARN-2010.patch,
>                      issue-stacktrace.rtf, yarn-2010-2.patch, yarn-2010-3.patch,
>                      yarn-2010-3.patch, yarn-2010-4.patch, yarn-2010-5.patch
>
>
> Sometimes, the RM fails to recover an application. It could be because of
> turning security on, token expiry, or issues connecting to HDFS etc. The
> causes could be classified into (1) transient, (2) specific to one
> application, and (3) permanent and apply to multiple (all) applications.
> Today, the RM fails to transition to Active and ends up in STOPPED state and
> can never be transitioned to Active again.
> The initial stacktrace reported is at
> https://issues.apache.org/jira/secure/attachment/12676476/issue-stacktrace.rtf

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)