[ https://issues.apache.org/jira/browse/YARN-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15856952#comment-15856952 ]
ASF GitHub Bot commented on YARN-6061:
--------------------------------------
Github user kambatla commented on a diff in the pull request:
https://github.com/apache/hadoop/pull/182#discussion_r99949771
--- Diff: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ResourceManager.java ---
@@ -819,19 +824,39 @@ public void handle(RMFatalEvent event) {
     }
   }
-  public void handleTransitionToStandBy() {
-    if (rmContext.isHAEnabled()) {
-      try {
-        // Transition to standby and reinit active services
-        LOG.info("Transitioning RM to Standby mode");
-        transitionToStandby(true);
-        EmbeddedElector elector = rmContext.getLeaderElectorService();
-        if (elector != null) {
-          elector.rejoinElection();
+  /**
+   * Transition to standby in a new thread.
+   */
+  public void handleTransitionToStandByInNewThread() {
+    Thread standByTransitionThread =
+        new Thread(activeServices.standByTransitionRunnable);
+    standByTransitionThread.setName("StandByTransitionThread");
+    standByTransitionThread.start();
+  }
+
+  private class StandByTransitionRunnable implements Runnable {
+    private AtomicBoolean hasRun = new AtomicBoolean(false);
+
+    @Override
+    public void run() {
+      // Prevent from running again if it has run.
--- End diff --
Add more detail here: "Run this only once, even if multiple threads end up
triggering this simultaneously."
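
For illustration, here is a minimal sketch of the run-once guard that comment describes. The diff above is cut off before the guard itself, so the `compareAndSet` check is an assumption about how `hasRun` could be used, and `RunOnceRunnable` and `RunOnceDemo` are hypothetical names that do not appear in the patch.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper (not from the patch): runs the delegate at most once. */
final class RunOnceRunnable implements Runnable {
  private final AtomicBoolean hasRun = new AtomicBoolean(false);
  private final Runnable delegate;

  RunOnceRunnable(Runnable delegate) {
    this.delegate = delegate;
  }

  @Override
  public void run() {
    // Run this only once, even if multiple threads end up triggering
    // this simultaneously: only the caller that flips false -> true proceeds.
    if (!hasRun.compareAndSet(false, true)) {
      return;
    }
    delegate.run();
  }
}

/** Hypothetical driver that exercises the guard from several threads. */
final class RunOnceDemo {
  /** Starts the given number of threads on one guard; returns how often the body ran. */
  static int countRuns(int threads) {
    AtomicInteger runs = new AtomicInteger();
    Runnable once = new RunOnceRunnable(runs::incrementAndGet);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(once, "StandByTransitionThread-" + i);
      workers[i].start();
    }
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while joining", e);
      }
    }
    return runs.get();
  }

  public static void main(String[] args) {
    System.out.println("runs = " + countRuns(8));
  }
}
```

A plain `boolean` field would not be enough here, because two threads could both read `false` before either writes `true`; the atomic compare-and-set closes that window.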
> Add a customized uncaughtexceptionhandler for critical threads in RM
> --------------------------------------------------------------------
>
> Key: YARN-6061
> URL: https://issues.apache.org/jira/browse/YARN-6061
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Reporter: Yufei Gu
> Assignee: Yufei Gu
> Attachments: YARN-6061.001.patch, YARN-6061.002.patch,
> YARN-6061.003.patch, YARN-6061.004.patch, YARN-6061.005.patch,
> YARN-6061.006.patch, YARN-6061.007.patch
>
>
> The fair scheduler runs several threads. If a runtime exception is thrown
> inside one of them, that thread exits. We should bring down the RM when that
> happens; otherwise the RM may keep running and behave unpredictably.
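
As a rough sketch of the idea in the issue description, the standard `Thread.UncaughtExceptionHandler` hook can turn a silent thread death into an explicit shutdown request. Everything below is illustrative: `CriticalThreadExceptionHandler`, the `onFatal` callback, and the thread name are assumptions and not the RM's actual classes, and a real handler would trigger the RM's own shutdown or standby transition instead of setting a flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class CriticalThreadHandlerDemo {
  /** Hypothetical handler: reports the failure and invokes a fatal-error callback. */
  static final class CriticalThreadExceptionHandler
      implements Thread.UncaughtExceptionHandler {
    private final Runnable onFatal;

    CriticalThreadExceptionHandler(Runnable onFatal) {
      this.onFatal = onFatal;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
      // Runs on the dying thread, before it terminates.
      System.err.println("Critical thread " + t.getName() + " crashed: " + e);
      onFatal.run();
    }
  }

  /** Returns true if a crashing worker thread caused the callback to fire. */
  static boolean handlerTriggersShutdown() {
    AtomicBoolean shutdownRequested = new AtomicBoolean(false);
    Thread worker = new Thread(() -> {
      throw new IllegalStateException("simulated failure in a critical thread");
    }, "critical-worker");
    worker.setUncaughtExceptionHandler(
        new CriticalThreadExceptionHandler(() -> shutdownRequested.set(true)));
    worker.start();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while joining", e);
    }
    return shutdownRequested.get();
  }

  public static void main(String[] args) {
    System.out.println("shutdown requested: " + handlerTriggersShutdown());
  }
}
```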