[ https://issues.apache.org/jira/browse/YARN-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16444876#comment-16444876 ]
Eric Yang commented on YARN-8122:
---------------------------------

[~gsaha] Thank you for the patch. I tried to simulate a cluster with a bad Docker daemon on one of the node managers. I see that containers are getting relaunched, and the relaunching happens at a steady rate. When the calculation happens, it doesn't take into account how many containers have failed and been retried during the container-health-threshold.window. The calculation is based only on the number of currently running containers. Hence, the service is reporting healthy instead of unhealthy.

I think a more accurate calculation would be:

health-threshold.percent = (completed + running containers) / total containers launched within health-threshold.window

A simpler alternative: total failed containers / total containers launched within container-health-threshold.window should be less than 1 - health-threshold.percent.

Nginx relies on supervisor to start its processes, so it will not work without ENTRY_POINT support. I cannot get the example to work. Therefore, I think it would be safer to use centos/httpd-24-centos7 with the launch command /usr/bin/run-httpd in the example.

> Component health threshold monitor
> ----------------------------------
>
>                 Key: YARN-8122
>                 URL: https://issues.apache.org/jira/browse/YARN-8122
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Gour Saha
>            Assignee: Gour Saha
>            Priority: Major
>         Attachments: YARN-8122.001.patch, YARN-8122.002.patch, YARN-8122.003.patch, YARN-8122.004.patch, YARN-8122.draft.patch
>
>
> Slider supported component health threshold monitoring with SLIDER-1246. It
> would be good to have this feature for YARN Service too.
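
As a rough illustration of the two calculations proposed in the comment above, here is a minimal Python sketch. It is not taken from the YARN-8122 patch. The `WindowCounts` record, the function names, the choice to treat an empty window as healthy, and the example numbers are all hypothetical. The threshold is assumed to be expressed as a percentage (for example 90).

```python
from dataclasses import dataclass


@dataclass
class WindowCounts:
    """Container counts seen within one health-threshold window (hypothetical shape)."""
    launched: int   # total containers launched in the window, including relaunches
    running: int    # containers currently running
    completed: int  # containers that completed successfully
    failed: int     # containers that failed (and may have been relaunched)


def healthy_ratio(counts: WindowCounts) -> float:
    """First proposal: (completed + running) / total launched within the window."""
    if counts.launched == 0:
        return 1.0  # assumption: nothing launched in the window counts as healthy
    return (counts.completed + counts.running) / counts.launched


def is_healthy(counts: WindowCounts, threshold_percent: float) -> bool:
    """Healthy when the ratio, as a percentage, meets the configured threshold."""
    return healthy_ratio(counts) * 100 >= threshold_percent


def is_healthy_by_failures(counts: WindowCounts, threshold_percent: float) -> bool:
    """Second proposal: failed / launched must be less than 1 - threshold."""
    if counts.launched == 0:
        return True  # assumption: same empty-window rule as above
    return counts.failed / counts.launched < 1 - threshold_percent / 100


# Hypothetical window on a cluster with one bad Docker daemon: 10 containers
# are running right now, but 10 more failed and were relaunched in the window.
flapping = WindowCounts(launched=20, running=10, completed=0, failed=10)
# Hypothetical stable window: every launched container is still running.
stable = WindowCounts(launched=10, running=10, completed=0, failed=0)

print(is_healthy(flapping, 90), is_healthy_by_failures(flapping, 90))
print(is_healthy(stable, 90), is_healthy_by_failures(stable, 90))
```

In the flapping window, a check that looks only at the currently running containers would see 10 of them and could report healthy, whereas both window-based checks count the 10 relaunches and report unhealthy.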