[ 
https://issues.apache.org/jira/browse/YARN-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446507#comment-16446507
 ] 

Gour Saha commented on YARN-8122:
---------------------------------

bq. I see that containers are getting relaunched, and relaunching at a 
steady rate. When the calculation happens, it doesn't take into account how 
many containers have failed and retried during the 
container-health-threshold.window.
[~eyang], component health here is the overall health during the window. Let's 
say your health threshold is 80% and you request 10 containers. If 9 
containers are running successfully at all times, but 1 container is struggling 
to come up because it keeps failing on the same node and/or across multiple 
nodes, your component as a whole is healthy the entire time, since 90% of its 
containers are running.
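
The arithmetic in the example above can be sketched as follows. This is only 
an illustration, not the YARN Service implementation: the function names are 
hypothetical, and treating a component that sits exactly at the threshold as 
healthy is an assumption.

```python
def component_health_percent(running: int, desired: int) -> float:
    """Percentage of the desired containers that are currently running."""
    if desired <= 0:
        raise ValueError("desired must be positive")
    return 100.0 * running / desired


def is_healthy(running: int, desired: int, threshold_percent: float) -> bool:
    """Compare the component's health against the threshold.

    Assumption: a component exactly at the threshold counts as healthy.
    Retries of an individual failing container do not enter the calculation;
    only the number of containers currently running does.
    """
    return component_health_percent(running, desired) >= threshold_percent


# 9 of 10 containers running against an 80% threshold: still healthy,
# no matter how often the 10th container has failed and been retried.
print(is_healthy(9, 10, 80.0))
# 7 of 10 containers running falls below the 80% threshold.
print(is_healthy(7, 10, 80.0))
```

Under this reading, a single container that fails repeatedly lowers the 
health percentage by at most its own share, which is why the retry count by 
itself does not make the component unhealthy.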

> Component health threshold monitor
> ----------------------------------
>
>                 Key: YARN-8122
>                 URL: https://issues.apache.org/jira/browse/YARN-8122
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Gour Saha
>            Assignee: Gour Saha
>            Priority: Major
>         Attachments: YARN-8122.001.patch, YARN-8122.002.patch, 
> YARN-8122.003.patch, YARN-8122.004.patch, YARN-8122.draft.patch
>
>
> Slider supported component health threshold monitoring with SLIDER-1246. It 
> would be good to have this feature for YARN Service too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
