[ https://issues.apache.org/jira/browse/YARN-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Bacsko updated YARN-6715:
-------------------------------
    Description:
NodeHealthScriptRunner does *not* report bad health if the script exits with an exit code other than 0.

Look at the {{FAILED_WITH_EXIT_CODE}} case:

{noformat}
    void reportHealthStatus(HealthCheckerExitStatus status) {
      long now = System.currentTimeMillis();
      switch (status) {
      case SUCCESS:
        setHealthStatus(true, "", now);
        break;
      case TIMED_OUT:
        setHealthStatus(false, NODE_HEALTH_SCRIPT_TIMED_OUT_MSG);
        break;
      case FAILED_WITH_EXCEPTION:
        setHealthStatus(false, exceptionStackTrace);
        break;
      case FAILED_WITH_EXIT_CODE:
        setHealthStatus(true, "", now);
        break;
      case FAILED:
        setHealthStatus(false, shexec.getOutput());
        break;
      }
    }
{noformat}

Based on the discussion in YARN-5567, this is intentional, but conflicts with the upstream document, which says:

This case also lacks unit test coverage.

  was:
There is a bug in NodeHealthScriptRunner. The {{FAILED_WITH_EXIT_CODE}} case is incorrect:

{noformat}
    void reportHealthStatus(HealthCheckerExitStatus status) {
      long now = System.currentTimeMillis();
      switch (status) {
      case SUCCESS:
        setHealthStatus(true, "", now);
        break;
      case TIMED_OUT:
        setHealthStatus(false, NODE_HEALTH_SCRIPT_TIMED_OUT_MSG);
        break;
      case FAILED_WITH_EXCEPTION:
        setHealthStatus(false, exceptionStackTrace);
        break;
      case FAILED_WITH_EXIT_CODE:
        setHealthStatus(true, "", now);
        break;
      case FAILED:
        setHealthStatus(false, shexec.getOutput());
        break;
      }
    }
{noformat}

This case also lacks unit test coverage.

> Fix documentation about NodeHealthScriptRunner
> -----------------------------------------------
>
>                 Key: YARN-6715
>                 URL: https://issues.apache.org/jira/browse/YARN-6715
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>
> NodeHealthScriptRunner does *not* report bad health if the script exits
> with an exit code other than 0.
> Look at the {{FAILED_WITH_EXIT_CODE}} case:
> {noformat}
>     void reportHealthStatus(HealthCheckerExitStatus status) {
>       long now = System.currentTimeMillis();
>       switch (status) {
>       case SUCCESS:
>         setHealthStatus(true, "", now);
>         break;
>       case TIMED_OUT:
>         setHealthStatus(false, NODE_HEALTH_SCRIPT_TIMED_OUT_MSG);
>         break;
>       case FAILED_WITH_EXCEPTION:
>         setHealthStatus(false, exceptionStackTrace);
>         break;
>       case FAILED_WITH_EXIT_CODE:
>         setHealthStatus(true, "", now);
>         break;
>       case FAILED:
>         setHealthStatus(false, shexec.getOutput());
>         break;
>       }
>     }
> {noformat}
> Based on the discussion in YARN-5567, this is intentional, but conflicts with
> the upstream document, which says:
>
> This case also lacks unit test coverage.
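
To illustrate the behavior described above, here is a minimal, self-contained sketch of the status-to-health mapping. It is not the Hadoop class itself: {{HealthStatusSketch}} and {{isReportedHealthy}} are hypothetical names, and only the enum constants and the true/false outcomes are taken from the snippet quoted in the description. A real unit test would have to exercise NodeHealthScriptRunner directly.

```java
class HealthStatusSketch {
    // Enum constants copied from the switch statement quoted in the issue.
    enum HealthCheckerExitStatus {
        SUCCESS, TIMED_OUT, FAILED_WITH_EXCEPTION, FAILED_WITH_EXIT_CODE, FAILED
    }

    // Hypothetical helper: returns the boolean that the quoted
    // reportHealthStatus() passes to setHealthStatus() for each status.
    static boolean isReportedHealthy(HealthCheckerExitStatus status) {
        switch (status) {
            case SUCCESS:
            case FAILED_WITH_EXIT_CODE: // non-zero exit code still counts as healthy
                return true;
            case TIMED_OUT:
            case FAILED_WITH_EXCEPTION:
            case FAILED:
                return false;
            default:
                throw new IllegalArgumentException("Unknown status: " + status);
        }
    }

    public static void main(String[] args) {
        // Print the mapping for every status so the odd case is easy to spot.
        for (HealthCheckerExitStatus s : HealthCheckerExitStatus.values()) {
            System.out.println(s + " -> healthy=" + isReportedHealthy(s));
        }
    }
}
```

A test for the uncovered case would, under these assumptions, amount to asserting that {{FAILED_WITH_EXIT_CODE}} yields a healthy report while {{FAILED}} does not.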