[ https://issues.apache.org/jira/browse/YARN-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711414#comment-14711414 ]
Varun Saxena commented on YARN-4079:
------------------------------------
This JIRA has been raised based on the discussion on YARN-3011
(https://issues.apache.org/jira/browse/YARN-3011?focusedCommentId=14710000&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14710000).
We can probably decide here if we want to handle it as above or not.
> Retrospect on the decision of making yarn.dispatcher.exit-on-error as true
> explicitly in daemons
> ------------------------------------------------------------------------------------------------
>
> Key: YARN-4079
> URL: https://issues.apache.org/jira/browse/YARN-4079
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.7.1
> Reporter: Varun Saxena
> Assignee: Varun Saxena
>
> Currently, all daemons explicitly set this config to true so that they
> crash instead of hanging around. This seems correct: a recoverable
> exception should be caught and handled, NOT leaked through to
> AsyncDispatcher, and a non-recoverable one should lead to a crash
> anyway.
> But this can make the system more fragile if we fail to catch all
> recoverable exceptions.
> Currently we do not even have the option of setting it to false in the
> configuration, even if we wanted to.
> We could probably read this value from the configuration and set it to true
> in daemons only if it is not configured.
> This way, in production clusters, if an exception is causing the daemon to
> crash frequently and we find that it is unavoidable but not a very big issue
> (i.e. the daemon can still work normally for the most part), we can at least
> set the configuration to false in the config file.
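
The proposal in the description (read the value from configuration, fall back to true in daemons when it is not configured) could be sketched roughly as below. This is only an illustration, not the actual patch: java.util.Properties stands in for Hadoop's Configuration so the sketch runs on its own, and the class name ExitOnErrorConfig, the method resolveExitOnError and the constant DAEMON_DEFAULT are hypothetical. Only the key yarn.dispatcher.exit-on-error comes from this issue.

```java
import java.util.Properties;

// Hypothetical stand-in for daemon start-up code. java.util.Properties is used
// in place of Hadoop's Configuration so that this sketch is self-contained.
final class ExitOnErrorConfig {
    // Config key named in this issue.
    static final String EXIT_ON_ERROR_KEY = "yarn.dispatcher.exit-on-error";
    // Assumed daemon-side default: crash unless the operator opts out.
    static final boolean DAEMON_DEFAULT = true;

    private ExitOnErrorConfig() {}

    // Returns the operator's setting if present, otherwise the daemon default.
    static boolean resolveExitOnError(Properties conf) {
        String configured = conf.getProperty(EXIT_ON_ERROR_KEY);
        if (configured == null) {
            return DAEMON_DEFAULT;
        }
        return Boolean.parseBoolean(configured.trim());
    }

    public static void main(String[] args) {
        Properties unset = new Properties();
        Properties optedOut = new Properties();
        optedOut.setProperty(EXIT_ON_ERROR_KEY, "false");

        System.out.println(resolveExitOnError(unset));    // prints "true"
        System.out.println(resolveExitOnError(optedOut)); // prints "false"
    }
}
```

With this shape, existing deployments keep the crash-on-error behaviour, while an operator hitting a frequent but benign exception can opt out in the config file.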
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)