[
https://issues.apache.org/jira/browse/YARN-8122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446507#comment-16446507
]
Gour Saha commented on YARN-8122:
---------------------------------
bq. I see that containers are getting relaunched, and the relaunching happens
at a steady rate. When the calculation happens, it doesn't take into account
how many containers have failed and been retried during the
container-health-threshold.window.
[~eyang], component health here is the overall health of the component during
the window. Let's say your health threshold is 80% and you request 10
containers. If 9 containers are running successfully at all times, but 1
container is struggling to come up because it keeps failing on the same node
and/or across multiple nodes, your component as a whole was healthy the entire
time, since 90% of its containers were running.
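
The arithmetic in the example above can be sketched roughly as follows. This is
only an illustration of the percentage check, not the code in the patch; the
function names (healthy_percent, component_is_healthy) are hypothetical, and
treating a component with zero requested containers as healthy is an assumption.

```python
def healthy_percent(running: int, desired: int) -> float:
    """Percentage of requested containers that are currently running."""
    if desired <= 0:
        # Assumption: a component that requests nothing counts as fully healthy.
        return 100.0
    return running * 100 / desired


def component_is_healthy(running: int, desired: int, threshold_percent: float) -> bool:
    """True when the running share meets or exceeds the health threshold."""
    return healthy_percent(running, desired) >= threshold_percent


# 10 containers requested, 9 running, threshold 80%:
# the one container that keeps failing does not pull the component below 80%.
print(healthy_percent(9, 10))            # 90.0
print(component_is_healthy(9, 10, 80))   # True
print(component_is_healthy(7, 10, 80))   # False
```

The check looks only at how many containers are running at a given moment, so
repeated retries of a single container do not change the result as long as the
other containers stay up.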
> Component health threshold monitor
> ----------------------------------
>
> Key: YARN-8122
> URL: https://issues.apache.org/jira/browse/YARN-8122
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Gour Saha
> Assignee: Gour Saha
> Priority: Major
> Attachments: YARN-8122.001.patch, YARN-8122.002.patch,
> YARN-8122.003.patch, YARN-8122.004.patch, YARN-8122.draft.patch
>
>
> Slider supported component health threshold monitoring with SLIDER-1246. It
> would be good to have this feature for YARN Service too.