[ 
https://issues.apache.org/jira/browse/YARN-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139451#comment-17139451
 ] 

Jim Brennan commented on YARN-9809:
-----------------------------------

[~eyang], [~ebadger], changing the behavior of health-check scripts seems pretty 
dangerous.  We looked into this issue a few years ago because we had some 
cases where the health-check scripts were not installed properly, and some bad 
nodes were erroneously reporting a healthy status.

Rather than try to change the contract for how health-check scripts behave, 
which has been around for a very long time, we instead added a wrapper script 
that we ship with Hadoop.  The wrapper checks that the real health-check script 
exists and is executable; if either check fails, it prints an "ERROR" message 
so the NM will mark the node unhealthy.  If the health-check script is good, we 
just exec it.
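
To make the wrapper idea concrete, here is a minimal sketch in Python.  It is 
not the wrapper we actually ship; the script path and the function names are 
hypothetical, and it only assumes the existing contract that the NM treats an 
output line beginning with "ERROR" as an unhealthy report.

```python
"""Hypothetical wrapper around a YARN NodeManager health-check script."""
import os
import sys

# Assumed location of the real health-check script (illustrative only).
REAL_SCRIPT = "/usr/local/bin/real-health-check.sh"


def check_script(path):
    """Return an ERROR line if the script cannot be run, otherwise None."""
    if not os.path.isfile(path):
        return f"ERROR health-check script {path} does not exist"
    if not os.access(path, os.X_OK):
        return f"ERROR health-check script {path} is not executable"
    return None


def main(path=REAL_SCRIPT):
    problem = check_script(path)
    if problem is not None:
        # An output line starting with "ERROR" makes the NM mark the node unhealthy.
        print(problem)
        return 0
    # Replace this process with the real script so its output goes straight to the NM.
    os.execv(path, [path] + sys.argv[1:])


# A deployed copy would end with: sys.exit(main())
```

The NM would then be configured to run the wrapper instead of the real script, 
so a missing or non-executable script shows up as an unhealthy node rather than 
a silently healthy one.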

I agree that changing the handling of health-check script output/return value 
is beyond the scope of this Jira.


> NMs should supply a health status when registering with RM
> ----------------------------------------------------------
>
>                 Key: YARN-9809
>                 URL: https://issues.apache.org/jira/browse/YARN-9809
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>         Attachments: YARN-9809.001.patch, YARN-9809.002.patch, 
> YARN-9809.003.patch, YARN-9809.004.patch
>
>
> Currently if the NM registers with the RM and it is unhealthy, it can be 
> scheduled many containers before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of the 
> containers.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

