[ https://issues.apache.org/jira/browse/YARN-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15487889#comment-15487889 ]
Allen Wittenauer commented on YARN-5635:
----------------------------------------

bq. does that hold true for even making it an option via a configuration setting?

Yes. I don't know how many ways I can tell you that depending on an error code here is extremely dangerous and has proven to be unreliable due to the constantly shifting state of the node on busy clusters. Throw in all of the "magically expanding/shrinking" task resource management bits that have gone in, and the situation gets even worse.

Besides, if you REALLY REALLY REALLY want to do this, all you need to do is wrap your existing health check in something else that, upon failure, prints the ERROR message.

> Better handling when bad script is configured as Node's HealthScript
> --------------------------------------------------------------------
>
>                 Key: YARN-5635
>                 URL: https://issues.apache.org/jira/browse/YARN-5635
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Allen Wittenauer
>            Assignee: Yufei Gu
>
> The earlier fix to YARN-5567 was reverted because it's not ideal to take the
> whole cluster down because of a bad script. At the same time, it's important
> to report that the script configured as the node health script is erroneous,
> as it might fail to detect bad health of a node.
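
For anyone who wants to try the wrapper approach mentioned in the comment, here is a rough Python sketch. It assumes the NodeManager treats a stdout line beginning with ERROR as an unhealthy report, and that the existing check is a separate command passed in as arguments. The names wrap_health_check and main are made up for illustration and are not part of Hadoop. Note that this deliberately turns a non-zero exit status into an ERROR line, which is exactly the behavior the comment warns can be unreliable on busy nodes, so treat it as an opt-in experiment rather than a recommendation.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper around an existing node health check (illustrative only)."""
import subprocess
import sys


def wrap_health_check(cmd):
    """Run cmd and return the lines the wrapper should print to stdout.

    The wrapped check's own output is passed through unchanged. If the check
    cannot be started or exits non-zero, an extra line beginning with ERROR
    is appended (assumption: that prefix is what marks the node unhealthy).
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        # Covers a missing or non-executable script.
        return ["ERROR health check could not be started: {}".format(exc)]

    lines = result.stdout.splitlines()
    if result.returncode != 0:
        lines.append(
            "ERROR health check exited with status {}".format(result.returncode)
        )
    return lines


def main(argv):
    """Print the wrapped output; always return 0 so only ERROR lines matter."""
    if len(argv) < 2:
        print("ERROR no health check command given")
        return 0
    for line in wrap_health_check(argv[1:]):
        print(line)
    return 0


# When installed as the configured health script, end the file with:
#     sys.exit(main(sys.argv))
```

The wrapper would then be configured as the health script in place of the original, with the original check's path given as its argument.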