[ https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wilfred Spiegelenburg reopened YARN-5567:
-----------------------------------------
The script handling code has a lot of comments explaining why the exit code was
ignored and why a non-zero exit code should not change the health status:
{code}
 * The node is marked unhealthy if
 * <ol>
 * <li>The node health script times out</li>
 * <li>The node health scripts output has a line which begins with ERROR</li>
 * <li>An exception is thrown while executing the script</li>
 * </ol>
 * If the script throws {@link IOException} or {@link ExitCodeException} the
 * output is ignored and node is left remaining healthy, as script might
 * have syntax error.
{code}
What we have just done breaks all of this: we no longer ignore the exit code,
and we mark the node as unhealthy instead. I assume the original behaviour was
chosen for a reason, so we may have just introduced a backwards incompatible
behavioural change.
Looking at the underlying ShellCommandExecutor and tracing back to the
{{Shell.runCommand()}} method: every non-zero exit code will throw an
{{ExitCodeException}}.
If we are going to change the documented behaviour, we should not do it in
release 2.8.1, and we should also update all related documentation.
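To make the difference concrete, here is a minimal sketch of the two behaviours. It is not the actual NodeHealthScriptRunner code: {{ScriptOutcome}}, {{HealthStatusSketch}} and the {{exitCodeMarksUnhealthy}} flag are names I made up for illustration, and timeouts and other exceptions are only represented as enum values, not actually detected.

```java
import java.util.List;

/** Hypothetical, simplified outcome of one health-script run. */
enum ScriptOutcome {
    SUCCESS, TIMED_OUT, FAILED_WITH_EXIT_CODE, FAILED_WITH_EXCEPTION, REPORTED_ERROR
}

final class HealthStatusSketch {

    /** Classifies a finished run from its exit code and its output lines. */
    static ScriptOutcome classify(int exitCode, List<String> outputLines) {
        if (exitCode != 0) {
            // A non-zero exit code surfaces as an ExitCodeException,
            // and the script output is then ignored.
            return ScriptOutcome.FAILED_WITH_EXIT_CODE;
        }
        for (String line : outputLines) {
            if (line.startsWith("ERROR")) {
                return ScriptOutcome.REPORTED_ERROR;
            }
        }
        return ScriptOutcome.SUCCESS;
    }

    /**
     * Maps an outcome to a health status.
     * exitCodeMarksUnhealthy == false: the documented behaviour (exit code ignored).
     * exitCodeMarksUnhealthy == true:  the behaviour after the patch.
     */
    static boolean isHealthy(ScriptOutcome outcome, boolean exitCodeMarksUnhealthy) {
        switch (outcome) {
            case SUCCESS:
                return true;
            case FAILED_WITH_EXIT_CODE:
                return !exitCodeMarksUnhealthy;
            default:
                // Timeout, ERROR line in the output, or another exception.
                return false;
        }
    }
}
```

With the flag set to false, a script that exits with code 1 (for example because of a syntax error) leaves the node healthy; with the flag set to true, the same script marks the node unhealthy. That flip is the behavioural change I am concerned about.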
> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --------------------------------------------------------------------------
>
> Key: YARN-5567
> URL: https://issues.apache.org/jira/browse/YARN-5567
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.8.0, 3.0.0-alpha1
> Reporter: Yufei Gu
> Assignee: Yufei Gu
> Fix For: 2.8.1
>
> Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
> case FAILED_WITH_EXIT_CODE:
> setHealthStatus(true, "", now);
> break;
> {code}
> should be
> {code}
> case FAILED_WITH_EXIT_CODE:
> setHealthStatus(false, "", now);
> break;
> {code}