Steve Loughran created YARN-3337:
------------------------------------
Summary: Provide YARN chaos monkey
Key: YARN-3337
URL: https://issues.apache.org/jira/browse/YARN-3337
Project: Hadoop YARN
Issue Type: New Feature
Components: test
Affects Versions: 2.7.0
Reporter: Steve Loughran
To test failure resilience today, you need either custom scripts or
Chaos Monkey-like logic implemented in your application (SLIDER-202).
Killing AMs and containers on a schedule, with a configurable probability,
is the core activity here; a CLI app or client library could handle it.
# entry point with a startup delay before acting
# frequency of chaos wakeup/polling
# probability of AM failure generation (0-100)
# probability of non-AM container kill
# future: other operations
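
The options listed above could be sketched roughly as follows. This is only an illustration of the scheduling and probability logic, not a proposed implementation: the names ChaosConfig, ChaosMonkey, kill_am and kill_container are hypothetical, the kill callbacks stand in for whatever YARN client calls would actually be used, and treating the container-kill probability as a 0-100 percentage is an assumption carried over from the AM option.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class ChaosConfig:
    """Hypothetical settings mirroring the options listed in this issue."""
    startup_delay_s: float = 60.0     # wait this long before acting at all
    interval_s: float = 30.0          # time between chaos wakeups
    am_kill_percent: int = 0          # 0-100 chance per wakeup of an AM kill
    container_kill_percent: int = 0   # 0-100 chance per wakeup (assumed range)

    def __post_init__(self):
        for name in ("am_kill_percent", "container_kill_percent"):
            value = getattr(self, name)
            if not 0 <= value <= 100:
                raise ValueError(f"{name} must be in 0-100, got {value}")


class ChaosMonkey:
    """Decides, on each wakeup, whether to trigger each failure action."""

    def __init__(self, config, kill_am, kill_container,
                 rng=None, sleep=time.sleep):
        self.config = config
        self.kill_am = kill_am                # hypothetical callback
        self.kill_container = kill_container  # hypothetical callback
        self.rng = rng or random.Random()
        self.sleep = sleep                    # injectable so tests need not wait

    def _chance(self, percent):
        # randrange(100) yields 0..99, so 0 never fires and 100 always fires.
        return self.rng.randrange(100) < percent

    def wakeup(self):
        """Run one chaos round and return the names of the actions taken."""
        actions = []
        if self._chance(self.config.am_kill_percent):
            self.kill_am()
            actions.append("am")
        if self._chance(self.config.container_kill_percent):
            self.kill_container()
            actions.append("container")
        return actions

    def run(self, wakeups):
        """Sleep for the startup delay, then perform a fixed number of wakeups."""
        self.sleep(self.config.startup_delay_s)
        log = []
        for i in range(wakeups):
            log.append(self.wakeup())
            if i + 1 < wakeups:
                self.sleep(self.config.interval_s)
        return log
```

Passing the random source and the sleep function in as parameters keeps the decision logic deterministic under test; a real CLI would presumably loop until stopped rather than for a fixed number of wakeups.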
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)