[ 
https://issues.apache.org/jira/browse/YARN-6715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Bacsko updated YARN-6715:
-------------------------------
    Description: 
NodeHealthScriptRunner does *not* report the node as unhealthy if the script 
exits with a non-zero exit code. Look at the {{FAILED_WITH_EXIT_CODE}} case:

{noformat}
    void reportHealthStatus(HealthCheckerExitStatus status) {
      long now = System.currentTimeMillis();
      switch (status) {
      case SUCCESS:
        setHealthStatus(true, "", now);
        break;
      case TIMED_OUT:
        setHealthStatus(false, NODE_HEALTH_SCRIPT_TIMED_OUT_MSG);
        break;
      case FAILED_WITH_EXCEPTION:
        setHealthStatus(false, exceptionStackTrace);
        break;
      case FAILED_WITH_EXIT_CODE:
        setHealthStatus(true, "", now);
        break;
      case FAILED:
        setHealthStatus(false, shexec.getOutput());
        break;
      }
    }
{noformat}

Based on the discussion in YARN-5567, this is intentional, but it conflicts 
with the upstream document, which says: 

This case also lacks unit test coverage.
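
To make the mapping above easier to check, here is a minimal, self-contained sketch of the kind of assertion a unit test for this case could make. It is not the actual Hadoop test code: the class {{HealthStatusSketch}} and the helper {{isReportedHealthy}} are hypothetical names introduced only for illustration, and the helper simply mirrors the boolean that each case in the snippet passes to {{setHealthStatus}}.

```java
public class HealthStatusSketch {

  // Mirrors the enum constants visible in the reportHealthStatus snippet.
  enum HealthCheckerExitStatus {
    SUCCESS, TIMED_OUT, FAILED_WITH_EXCEPTION, FAILED_WITH_EXIT_CODE, FAILED
  }

  // Hypothetical helper (not part of Hadoop): returns the boolean that the
  // snippet passes as the first argument of setHealthStatus for each status.
  static boolean isReportedHealthy(HealthCheckerExitStatus status) {
    switch (status) {
      case SUCCESS:
      case FAILED_WITH_EXIT_CODE: // a non-zero exit code alone is still "healthy"
        return true;
      case TIMED_OUT:
      case FAILED_WITH_EXCEPTION:
      case FAILED:
        return false;
      default:
        throw new IllegalArgumentException("Unknown status: " + status);
    }
  }

  public static void main(String[] args) {
    // Print the mapping for every status so the odd case stands out.
    for (HealthCheckerExitStatus s : HealthCheckerExitStatus.values()) {
      System.out.println(s + " -> healthy=" + isReportedHealthy(s));
    }
  }
}
```

A real test would presumably drive NodeHealthScriptRunner with a script that exits non-zero and then assert on the reported health; the sketch only pins down the expected outcome for each status.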

  was:
There is a bug in NodeHealthScriptRunner. The {{FAILED_WITH_EXIT_CODE}} case is 
incorrect:

{noformat}
    void reportHealthStatus(HealthCheckerExitStatus status) {
      long now = System.currentTimeMillis();
      switch (status) {
      case SUCCESS:
        setHealthStatus(true, "", now);
        break;
      case TIMED_OUT:
        setHealthStatus(false, NODE_HEALTH_SCRIPT_TIMED_OUT_MSG);
        break;
      case FAILED_WITH_EXCEPTION:
        setHealthStatus(false, exceptionStackTrace);
        break;
      case FAILED_WITH_EXIT_CODE:
        setHealthStatus(true, "", now);
        break;
      case FAILED:
        setHealthStatus(false, shexec.getOutput());
        break;
      }
    }
{noformat}

This case also lacks unit test coverage.


> Fix documentation about NodeHealthScriptRunner 
> -----------------------------------------------
>
>                 Key: YARN-6715
>                 URL: https://issues.apache.org/jira/browse/YARN-6715
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Peter Bacsko
>            Assignee: Peter Bacsko
>            Priority: Major
>
> NodeHealthScriptRunner does *not* report the node as unhealthy if the script 
> exits with a non-zero exit code. Look at the {{FAILED_WITH_EXIT_CODE}} case:
> {noformat}
>     void reportHealthStatus(HealthCheckerExitStatus status) {
>       long now = System.currentTimeMillis();
>       switch (status) {
>       case SUCCESS:
>         setHealthStatus(true, "", now);
>         break;
>       case TIMED_OUT:
>         setHealthStatus(false, NODE_HEALTH_SCRIPT_TIMED_OUT_MSG);
>         break;
>       case FAILED_WITH_EXCEPTION:
>         setHealthStatus(false, exceptionStackTrace);
>         break;
>       case FAILED_WITH_EXIT_CODE:
>         setHealthStatus(true, "", now);
>         break;
>       case FAILED:
>         setHealthStatus(false, shexec.getOutput());
>         break;
>       }
>     }
> {noformat}
> Based on the discussion in YARN-5567, this is intentional, but it conflicts 
> with the upstream document, which says: 
> This case also lacks unit test coverage.


