[ https://issues.apache.org/jira/browse/YARN-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sangjin Lee updated YARN-3038:
------------------------------
Summary: [Collector wireup] Handle timeline collector failure scenarios
(was: [Aggregator wireup] Handle ATS writer failure scenarios)
> [Collector wireup] Handle timeline collector failure scenarios
> --------------------------------------------------------------
>
> Key: YARN-3038
> URL: https://issues.apache.org/jira/browse/YARN-3038
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: timelineserver
> Reporter: Sangjin Lee
> Assignee: Varun Saxena
>
> Per design in YARN-2928, consider various ATS writer failure scenarios, and
> implement proper handling.
> For example, ATS writers may fail and exit due to OOM. In that case, the writer
> should be retried a certain number of times. We also need to tie fatal ATS writer
> failures (after exhausting all retries) to the application failure, and so on.
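The retry-then-escalate behaviour described above could look roughly like the following. This is a minimal sketch, not the YARN-2928 design or any existing Hadoop API: `CollectorRetrySketch`, `WriterTask`, `FatalFailureHandler` and `runWithRetries` are hypothetical names invented for illustration, and the sketch models the writer as in-process work, whereas a writer that actually exits would have to be detected and restarted by whatever supervises it.

```java
/**
 * Illustrative only: bounded retries for a failing writer, followed by
 * escalation once the retries are exhausted. All names are hypothetical.
 */
final class CollectorRetrySketch {

    /** Hypothetical stand-in for a unit of writer work that may fail. */
    interface WriterTask {
        void run() throws Exception;
    }

    /** Hypothetical hook for tying a fatal writer failure to the application. */
    interface FatalFailureHandler {
        void onFatalFailure(Throwable lastError);
    }

    private CollectorRetrySketch() {
    }

    /**
     * Runs the task once, then retries it up to maxRetries more times on failure.
     * Returns true as soon as one attempt succeeds. If every attempt fails,
     * invokes the handler exactly once with the last error and returns false.
     */
    static boolean runWithRetries(WriterTask task, int maxRetries,
                                  FatalFailureHandler handler) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Throwable lastError = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                task.run();
                return true;
            } catch (Exception | OutOfMemoryError e) {
                // Catching OutOfMemoryError in-process is only for illustration;
                // recovery after a real OOM is not guaranteed.
                lastError = e;
            }
        }
        // Retries exhausted: report the failure so the caller can fail the application.
        handler.onFatalFailure(lastError);
        return false;
    }
}
```

A caller might pass a handler that marks the application as failed, for example `runWithRetries(task, 3, err -> markApplicationFailed(err))`, where `markApplicationFailed` is again a placeholder. The number of retries would presumably be configurable rather than hard-coded.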