[ 
https://issues.apache.org/jira/browse/YARN-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14718157#comment-14718157
 ] 

Robert Metzger commented on YARN-3337:
--------------------------------------

For anyone looking for a very simple YARN chaos monkey that works similarly to 
what [~steve_l] described here, I have one available: 
https://github.com/rmetzger/yarn-chaos-monkey
It does not run within the AM.
To kill the containers, I basically ssh into the remote host and 
kill the process.
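
As a rough illustration of the ssh-and-kill approach described above, here is a minimal sketch. It is not taken from the linked repository. The function names, the `dry_run` flag, and the assumption that the container's host and process ID are already known are all hypothetical.

```python
import shlex
import subprocess
from typing import List


def build_kill_command(host: str, pid: int, signal: str = "KILL") -> List[str]:
    """Build an ssh command that sends a signal to one process on a remote host.

    Assumes the caller already knows which host and PID belong to the
    container; discovering them is out of scope for this sketch.
    """
    if pid <= 0:
        raise ValueError("pid must be positive")
    # `kill -s <signal> <pid>` is the POSIX form of the kill utility.
    remote = f"kill -s {shlex.quote(signal)} {pid}"
    return ["ssh", host, remote]


def kill_remote_process(host: str, pid: int, dry_run: bool = True) -> List[str]:
    """Return the command, and run it only when dry_run is False."""
    cmd = build_kill_command(host, pid)
    if not dry_run:
        # Raises CalledProcessError if ssh or the remote kill fails.
        subprocess.run(cmd, check=True)
    return cmd
```

With `dry_run=True` (the default here) nothing is executed, so the command can be inspected before it is pointed at a real cluster node.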

Maybe the link is helpful for somebody who needs such a tool right away.

> Provide YARN chaos monkey
> -------------------------
>
>                 Key: YARN-3337
>                 URL: https://issues.apache.org/jira/browse/YARN-3337
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: test
>    Affects Versions: 2.7.0
>            Reporter: Steve Loughran
>
> To test failure resilience today you either need custom scripts or implement 
> Chaos Monkey-like logic in your application (SLIDER-202). 
> Killing AMs and containers on a schedule & probability is the core activity 
> here, one that could be handled by a CLI App/client lib that does this. 
> # entry point to have a startup delay before acting
> # frequency of chaos wakeup/polling
> # probability of AM failure generation (0-100)
> # probability of non-AM container kill
> # future: other operations
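
The options listed in the issue description (startup delay, wakeup frequency, and separate 0-100 probabilities for AM and non-AM container kills) could be combined roughly as in the following sketch. It only shows the scheduling and probability logic. `ChaosConfig`, `run_chaos`, and the two kill callbacks are hypothetical names, and nothing here calls a real YARN API.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ChaosConfig:
    startup_delay_s: float       # wait this long before the first action
    interval_s: float            # time between chaos wakeups
    am_kill_percent: int         # chance (0-100) of killing the AM per wakeup
    container_kill_percent: int  # chance (0-100) of killing a non-AM container


def should_fire(percent: int, rng: random.Random) -> bool:
    """Return True with the given probability, expressed as 0-100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in the range 0-100")
    # rng.random() is in [0, 1), so 0 never fires and 100 always fires.
    return rng.random() * 100 < percent


def run_chaos(
    config: ChaosConfig,
    kill_am: Callable[[], None],
    kill_container: Callable[[], None],
    rounds: int,
    rng: Optional[random.Random] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a fixed number of wakeups and return how many kills were triggered."""
    if rng is None:
        rng = random.Random()
    sleep(config.startup_delay_s)
    fired = 0
    for _ in range(rounds):
        if should_fire(config.am_kill_percent, rng):
            kill_am()
            fired += 1
        if should_fire(config.container_kill_percent, rng):
            kill_container()
            fired += 1
        sleep(config.interval_s)
    return fired
```

Passing `sleep` and `rng` as parameters is a design choice for this sketch: it lets the loop be exercised without real delays and with a seeded generator.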



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
