[ https://issues.apache.org/jira/browse/YARN-5644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15487612#comment-15487612 ]
Ray Chiang commented on YARN-5644:
----------------------------------

[~aw] [~Naganarasimha], we can continue the discussion here. I did my best to summarize the earlier discussion in the description. Thanks.

> Define exit code for allowing NodeManager health script to mar
> --------------------------------------------------------------
>
>                 Key: YARN-5644
>                 URL: https://issues.apache.org/jira/browse/YARN-5644
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Ray Chiang
>            Assignee: Yufei Gu
>              Labels: supportability
>
> Done as an alternate design to YARN-5567. Define a specific exit code for the
> health checker script (property yarn.nodemanager.health-checker.script.path)
> that allows the node to be blacklisted.
> As discussed in the latter part of YARN-5567, the current design requirements
> are:
> # Ignore all exit codes from the script
> ## _except_ the newly defined error code, which will mark the NodeManager as
> UNHEALTHY
> ## This allows any syntax or functional errors in the script to be ignored
> # Upon failure (or multiple recorded failures):
> ## Store the status in the metrics2 state on the NodeManager
> ## Allow the RM to blacklist the NM or allow the jobs to drain

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
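
For illustration only, the exit-code policy in the issue description above could be sketched roughly as follows. This is not NodeManager code: the reserved value `UNHEALTHY_EXIT_CODE = 100`, the `HealthTracker` class, and the `failure_threshold` parameter are all hypothetical, since the issue names neither the exit code nor how many recorded failures should trigger blacklisting.

```python
from enum import Enum

# Hypothetical value: the issue does not say which exit code is reserved.
UNHEALTHY_EXIT_CODE = 100


class NodeHealth(Enum):
    HEALTHY = "HEALTHY"
    UNHEALTHY = "UNHEALTHY"


def classify_exit_code(exit_code: int) -> NodeHealth:
    """Only the reserved code marks the node UNHEALTHY.

    Every other exit code (including ones caused by syntax or
    functional errors in the script) is ignored.
    """
    if exit_code == UNHEALTHY_EXIT_CODE:
        return NodeHealth.UNHEALTHY
    return NodeHealth.HEALTHY


class HealthTracker:
    """Hypothetical stand-in for the recorded failure state."""

    def __init__(self, failure_threshold: int = 1):
        # Assumed knob: number of recorded failures before reporting.
        self.failure_threshold = failure_threshold
        self.recorded_failures = 0

    def record(self, exit_code: int) -> NodeHealth:
        """Classify one script run and count it if it was a failure."""
        health = classify_exit_code(exit_code)
        if health is NodeHealth.UNHEALTHY:
            self.recorded_failures += 1
        return health

    def should_report_unhealthy(self) -> bool:
        """True once enough failures were recorded for the RM to act."""
        return self.recorded_failures >= self.failure_threshold


tracker = HealthTracker(failure_threshold=2)
for code in (0, 1, 127, UNHEALTHY_EXIT_CODE, UNHEALTHY_EXIT_CODE):
    tracker.record(code)
print(tracker.recorded_failures, tracker.should_report_unhealthy())
```

Under these assumptions, exit codes such as 1 or 127 from a broken script leave the node HEALTHY, and only repeated use of the reserved code reaches the threshold at which the RM could blacklist the NM or let its jobs drain.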