[ https://issues.apache.org/jira/browse/YARN-261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrey Klochkov updated YARN-261:
---------------------------------

    Attachment: YARN-261--n2.patch

Updated the patch to prevent the exception in the log when processing RMAppEventType.ATTEMPT_KILLED.

Regarding failing the attempt instead of adding extra logic, the patch itself consists of:
1) CLI modifications
2) adding the API call and lots of boilerplate needed for that
3) replicating AttemptFailedTransaction into a modified version named AppRestartedTransaction (with different diagnostics)
4) modifying the state machines

The 4th part is needed precisely so that RMAppEventType.ATTEMPT_KILLED is not fired. I missed that part in the previous patch.

> Ability to kill AM attempts
> ---------------------------
>
>                 Key: YARN-261
>                 URL: https://issues.apache.org/jira/browse/YARN-261
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: api
>    Affects Versions: 2.0.3-alpha
>            Reporter: Jason Lowe
>         Attachments: YARN-261--n2.patch, YARN-261.patch
>
>
> It would be nice if clients could ask for an AM attempt to be killed. This
> is analogous to the task attempt kill support provided by MapReduce.
> This feature would be useful in a scenario where AM retries are enabled, the
> AM supports recovery, and a particular AM attempt is stuck. Currently if
> this occurs the user's only recourse is to kill the entire application,
> requiring them to resubmit a new application and potentially breaking
> downstream dependent jobs if it's part of a bigger workflow. Killing the
> attempt would allow a new attempt to be started by the RM without killing the
> entire application, and if the AM supports recovery it could potentially save
> a lot of work. It could also be useful in workflow scenarios where the
> failure of the entire application kills the workflow, but the ability to kill
> an attempt can keep the workflow going if the subsequent attempt succeeds.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira