[
https://issues.apache.org/jira/browse/YARN-7703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312814#comment-16312814
]
lujie edited comment on YARN-7703 at 1/5/18 10:08 AM:
------------------------------------------------------
I have an initial fix idea that needs review:
When an application receives a KILL event in the NEW state, the current code uses
AppKilledTransition, which skips storing the state. We can use
{code:java}
new FinalSavingTransition(new AppKilledTransition(), RMAppState.KILLED)
{code}
in place of AppKilledTransition, and change the postState to
FINAL_SAVING. FinalSavingTransition tells the StateStore to perform the store
action, and the StateStore replies to the application with APP_UPDATE_SAVED.
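To make the intended flow concrete, here is a minimal, self-contained sketch of the two behaviours (the current direct NEW -> KILLED jump and the proposed NEW -> FINAL_SAVING -> KILLED path). It is only a toy model, not the RMAppImpl code: ToyAppState, ToyAppEvent, ToyStateStore, ToyApp and the saveOnKillFromNew flag are all hypothetical names made up for illustration, and the store acknowledgement is sent by hand rather than by a dispatcher.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for RMAppState and RMAppEventType (not the real Hadoop enums).
enum ToyAppState { NEW, FINAL_SAVING, KILLED }

enum ToyAppEvent { KILL, APP_UPDATE_SAVED }

// Hypothetical in-memory store that records every final state it is asked to persist.
class ToyStateStore {
    final List<ToyAppState> saved = new ArrayList<>();

    void store(ToyAppState finalState) {
        saved.add(finalState);
    }
}

// Toy application that models only the kill-from-NEW paths discussed here.
class ToyApp {
    private final ToyStateStore store;
    private final boolean saveOnKillFromNew; // false = current behaviour, true = proposed
    private ToyAppState state = ToyAppState.NEW;
    private ToyAppState targetedFinalState;

    ToyApp(ToyStateStore store, boolean saveOnKillFromNew) {
        this.store = store;
        this.saveOnKillFromNew = saveOnKillFromNew;
    }

    ToyAppState getState() {
        return state;
    }

    void handle(ToyAppEvent event) {
        switch (state) {
            case NEW:
                if (event == ToyAppEvent.KILL) {
                    if (saveOnKillFromNew) {
                        // Proposed: remember the target, ask the store to save, then wait.
                        targetedFinalState = ToyAppState.KILLED;
                        store.store(targetedFinalState);
                        state = ToyAppState.FINAL_SAVING;
                    } else {
                        // Current: go straight to KILLED, nothing reaches the store.
                        state = ToyAppState.KILLED;
                    }
                    return;
                }
                break;
            case FINAL_SAVING:
                if (event == ToyAppEvent.APP_UPDATE_SAVED) {
                    // The store acknowledged the save; move to the remembered final state.
                    state = targetedFinalState;
                    return;
                }
                break;
            default:
                break;
        }
        throw new IllegalStateException("Invalid event " + event + " at " + state);
    }
}

class KillFromNewDemo {
    public static void main(String[] args) {
        ToyStateStore store = new ToyStateStore();
        ToyApp app = new ToyApp(store, true);
        app.handle(ToyAppEvent.KILL);
        System.out.println("after KILL: " + app.getState());
        app.handle(ToyAppEvent.APP_UPDATE_SAVED);
        System.out.println("after ack: " + app.getState() + ", stored: " + store.saved);
    }
}
```

With the flag set to false the toy store stays empty after the kill, which mirrors the problem reported in this issue: there is nothing to recover after a restart.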
In the unit test TestRMAppTransitions#testAppNewKill, we only need to add the line
{color:#d04437}assertAppState(RMAppState.FINAL_SAVING, application);{color}
before the call to sendAppUpdateSavedEvent.
I will attach a patch after YARN-7663 is fixed. That patch should also fix
another InvalidStateTransitionException (just noting it here).
> Apps killed from the NEW state are not recorded in the state store
> ------------------------------------------------------------------
>
> Key: YARN-7703
> URL: https://issues.apache.org/jira/browse/YARN-7703
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Reporter: Jason Lowe
> Assignee: lujie
>
> While reviewing YARN-7663 I noticed that apps killed from the NEW state skip
> storing anything to the RM state store. That means upon restart and recovery
> these apps will not be recovered, so they will simply disappear. That could
> be surprising for users.