[ 
https://issues.apache.org/jira/browse/YARN-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711415#comment-14711415
 ] 

Varun Saxena commented on YARN-4079:
------------------------------------

cc [~djp]

> Retrospect on the decision of making yarn.dispatcher.exit-on-error as true 
> explicitly in daemons
> ------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4079
>                 URL: https://issues.apache.org/jira/browse/YARN-4079
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.7.1
>            Reporter: Varun Saxena
>            Assignee: Varun Saxena
>
> Currently, every daemon explicitly sets this config to true so that it
> crashes instead of hanging around. This seems correct in principle: a
> recoverable exception should be caught and handled, NOT leaked through to
> AsyncDispatcher, and a non-recoverable one should lead to a crash anyway.
> However, this makes the system more fragile if we fail to catch every
> recoverable exception.
> Currently we do not even have the option of setting it to false in the
> configuration, even if we want to.
> We could instead read this value from the configuration and default it to
> true in daemons when it is not configured.
> This way, if a production cluster hits an exception that makes a daemon
> crash frequently, and we find that it is unavoidable but not a very big
> issue (i.e. the daemon can still work normally for the most part), we can
> at least set the configuration to false in the config file.
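
The proposal in the description could look roughly like the sketch below. It is
only an illustration, not the actual patch: it uses java.util.Properties as a
stand-in for the Hadoop configuration object, and the class name
ExitOnErrorConfig, the method resolveExitOnError and the constant
DAEMON_DEFAULT are hypothetical. Only the key yarn.dispatcher.exit-on-error
comes from the issue itself.

```java
import java.util.Properties;

public class ExitOnErrorConfig {
    // Key named in the issue title.
    static final String EXIT_ON_ERROR_KEY = "yarn.dispatcher.exit-on-error";

    // Hypothetical daemon-side default: keep today's behaviour (crash on error).
    static final boolean DAEMON_DEFAULT = true;

    /**
     * Returns the value a daemon would use: the configured value if the
     * operator set one, otherwise the daemon default of true.
     * Assumption: only an explicit "false" disables exit-on-error, so a
     * typo in the config file cannot silently turn the safety net off.
     */
    static boolean resolveExitOnError(Properties conf) {
        String raw = conf.getProperty(EXIT_ON_ERROR_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DAEMON_DEFAULT;
        }
        return !"false".equalsIgnoreCase(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Not configured: the daemon still exits on dispatcher errors.
        System.out.println(resolveExitOnError(conf));

        // Operator opts out in the config file for a known, tolerable exception.
        conf.setProperty(EXIT_ON_ERROR_KEY, "false");
        System.out.println(resolveExitOnError(conf));
    }
}
```

The point of the design is that the daemon no longer overwrites the value
unconditionally, so an explicit false in the config file is honoured while the
unconfigured case behaves as it does today.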



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
