Eric Badger commented on YARN-9809:

Patch 001 adds the feature but makes it opt-in via the config 
{{yarn.nodemanager.health-checker.run-before-startup}}. I didn't add the 
retries flag that would shut down the NM after a certain number of 
failures. I can do that in a subsequent patch if you'd like. I tested this 
patch and it seems to work.
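
As a rough illustration of the opt-in behavior described above (this is not code from the patch), the sketch below gates a pre-registration health check on the config key. The property name is taken from the comment. The class {{StartupHealthGate}}, the method {{healthAtRegistration}}, the use of a plain {{Map}} in place of Hadoop's configuration object, and the assumption that the property is a boolean defaulting to false are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; none of these class or method names come from YARN-9809.
final class StartupHealthGate {
    // Property name as given in the comment; boolean semantics are assumed.
    static final String RUN_BEFORE_STARTUP =
        "yarn.nodemanager.health-checker.run-before-startup";

    // Returns the health status to supply at registration, or empty when the
    // feature is not enabled and no check runs before registering.
    static Optional<Boolean> healthAtRegistration(
            Map<String, String> conf, BooleanSupplier healthCheck) {
        // Opt-in: an absent key is treated as "false" (assumed default).
        boolean enabled =
            Boolean.parseBoolean(conf.getOrDefault(RUN_BEFORE_STARTUP, "false"));
        if (!enabled) {
            return Optional.empty();
        }
        // Run the check once so a status is available when registering.
        return Optional.of(healthCheck.getAsBoolean());
    }
}
```

With the key unset, the check is never invoked and nothing is reported at registration. With the key set to {{true}}, an unhealthy result is available before the first heartbeat.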

> NMs should supply a health status when registering with RM
> ----------------------------------------------------------
>                 Key: YARN-9809
>                 URL: https://issues.apache.org/jira/browse/YARN-9809
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>         Attachments: YARN-9809.001.patch
> Currently, if an unhealthy NM registers with the RM, the RM can schedule 
> many containers on it before the first heartbeat. After the first 
> heartbeat, the RM will mark the NM as unhealthy and kill all of those 
> containers.
