[ 
https://issues.apache.org/jira/browse/YARN-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15826911#comment-15826911
 ] 

Yufei Gu commented on YARN-6061:
--------------------------------

Thanks [~kasha] for the review. The misunderstanding may be on my side. In the 
last patch I was trying to create a version of class 
{{YarnUncaughtExceptionHandler}} that terminates the program. If we narrow the 
scope to the RM, I agree with you that sending an RMFatalEvent seems more 
graceful. But to do that, we also need to modify the handler of RMFatalEvent, 
because the current handler just terminates the program, as the following code 
shows.
{code}
  public static class RMFatalEventDispatcher
      implements EventHandler<RMFatalEvent> {

    @Override
    public void handle(RMFatalEvent event) {
      LOG.fatal("Received a " + RMFatalEvent.class.getName() + " of type " +
          event.getType().name() + ". Cause:\n" + event.getCause());

      ExitUtil.terminate(1, event.getCause());
    }
  }
{code}
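
For illustration only, here is a minimal, self-contained sketch of the alternative being discussed: an uncaught-exception handler that forwards the failure as an event instead of terminating the process itself. All names in it ({{FatalEventSketch}}, {{FatalEvent}}, {{FatalEventHandler}}, {{ForwardingExceptionHandler}}) are hypothetical stand-ins and are not part of the patch or of YARN. The real change would dispatch an RMFatalEvent through the RM's dispatcher, and what the receiving handler does (exit, failover, something else) is left open here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these types exist in Hadoop.
class FatalEventSketch {

  /** Stand-in for RMFatalEvent: records which thread failed and why. */
  static final class FatalEvent {
    final String source;
    final Throwable cause;

    FatalEvent(String source, Throwable cause) {
      this.source = source;
      this.cause = cause;
    }
  }

  /** Stand-in for an EventHandler of fatal events. */
  interface FatalEventHandler {
    void handle(FatalEvent event);
  }

  /** Forwards an uncaught exception to the handler instead of exiting directly. */
  static final class ForwardingExceptionHandler
      implements Thread.UncaughtExceptionHandler {
    private final FatalEventHandler handler;

    ForwardingExceptionHandler(FatalEventHandler handler) {
      this.handler = handler;
    }

    @Override
    public void uncaughtException(Thread t, Throwable e) {
      // The decision about how to react is delegated to the event handler.
      handler.handle(new FatalEvent(t.getName(), e));
    }
  }

  public static void main(String[] args) throws InterruptedException {
    // This handler only records the event; a real one would decide how to shut down.
    List<FatalEvent> received = Collections.synchronizedList(new ArrayList<>());

    Thread worker = new Thread(() -> {
      throw new IllegalStateException("simulated scheduler failure");
    }, "update-thread");
    worker.setUncaughtExceptionHandler(new ForwardingExceptionHandler(received::add));
    worker.start();
    worker.join(); // the uncaught-exception handler runs on the dying thread before join returns

    for (FatalEvent event : received) {
      System.out.println("Fatal event from " + event.source + ": " + event.cause);
    }
  }
}
```

With this shape, the critical threads only need the handler installed via {{Thread#setUncaughtExceptionHandler}}, and the policy for reacting to the failure lives in one place, the event handler.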

> Add a customized uncaughtexceptionhandler for critical threads
> --------------------------------------------------------------
>
>                 Key: YARN-6061
>                 URL: https://issues.apache.org/jira/browse/YARN-6061
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: YARN-6061.001.patch
>
>
> There are several threads in the fair scheduler. A thread quits when a 
> runtime exception is thrown inside it. We should bring down the RM when that 
> happens; otherwise, the RM may behave unpredictably. 



