[
https://issues.apache.org/jira/browse/YARN-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828287#comment-13828287
]
Omkar Vinit Joshi commented on YARN-1416:
-----------------------------------------
Thanks [~jianhe]
I have a basic question: shouldn't the RM have crashed? We can't just ignore such
invalid state transitions, can we? I see that someone has modified RMAppImpl.java
to log the exception and then ignore it.
{code}
try {
  /* keep the master in sync with the state machine */
  this.stateMachine.doTransition(event.getType(), event);
} catch (InvalidStateTransitonException e) {
  LOG.error("Can't handle this event at current state", e);
  /* TODO fail the application on the failed transition */
}
{code}
I see that we also ignore this exception in other places after logging it. I'm not
sure this is right, because we may move the system into a corrupted state without
crashing or stopping it. At the very least we should add assert statements to all
the state machines to make sure that such transitions don't go unnoticed.
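To make the suggestion concrete, here is a rough sketch of the difference between log-and-ignore and failing fast. It is a toy state machine, not the real RMAppImpl or StateMachineFactory code; the class StrictTransitionSketch, its states and events, the failFast flag, and the ignored-transition counter are all made up for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical stand-in for an RMApp-like state machine; not Hadoop code. */
class StrictTransitionSketch {
  enum State { NEW, SUBMITTED, RUNNING, FAILED }
  enum Event { START, APP_ACCEPTED, FAIL }

  /** Thrown when no transition is registered for (current state, event). */
  static class InvalidTransitionException extends RuntimeException {
    InvalidTransitionException(State state, Event event) {
      super("Invalid event: " + event + " at " + state);
    }
  }

  // Legal transitions: state -> (event -> next state). Assumed for the example.
  private static final Map<State, Map<Event, State>> TABLE =
      new EnumMap<>(State.class);
  static {
    for (State s : State.values()) {
      TABLE.put(s, new EnumMap<>(Event.class));
    }
    TABLE.get(State.NEW).put(Event.START, State.SUBMITTED);
    TABLE.get(State.SUBMITTED).put(Event.APP_ACCEPTED, State.RUNNING);
    TABLE.get(State.RUNNING).put(Event.FAIL, State.FAILED);
  }

  private final boolean failFast;
  private State current = State.NEW;
  private int ignoredInvalidTransitions = 0;

  StrictTransitionSketch(boolean failFast) {
    this.failFast = failFast;
  }

  State getState() { return current; }

  int getIgnoredInvalidTransitions() { return ignoredInvalidTransitions; }

  void handle(Event event) {
    State next = TABLE.get(current).get(event);
    if (next == null) {
      InvalidTransitionException e =
          new InvalidTransitionException(current, event);
      if (failFast) {
        // Strict mode (e.g. in tests): surface the problem immediately.
        throw e;
      }
      // Lenient mode: log and ignore, but at least count the occurrence
      // so that a test can assert that it stayed at zero.
      ignoredInvalidTransitions++;
      System.err.println(
          "Can't handle this event at current state: " + e.getMessage());
      return;
    }
    current = next;
  }
}
```

With failFast set to true, a test that triggers an invalid transition fails instead of passing with an error buried in the log. With it set to false, the counter still lets a test check that nothing was silently dropped.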
I applied the patch and tested it locally. One more test still needs to be fixed:
{code}
2013-11-20 15:23:52,127 INFO [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(645)) - appattempt_1384989831257_0042_000001 State change from NEW to SUBMITTED
2013-11-20 15:23:52,129 ERROR [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(593)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: APP_ACCEPTED at RUNNING
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:591)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:77)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions$TestApplicationEventDispatcher.handle(TestRMAppTransitions.java:139)
    at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions$TestApplicationEventDispatcher.handle(TestRMAppTransitions.java:125)
    at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:159)
    at org.apache.hadoop.yarn.event.DrainDispatcher$1.run(DrainDispatcher.java:65)
    at java.lang.Thread.run(Thread.java:680)
{code}
> InvalidStateTransitions getting reported in multiple test cases even though
> they pass
> -------------------------------------------------------------------------------------
>
> Key: YARN-1416
> URL: https://issues.apache.org/jira/browse/YARN-1416
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Omkar Vinit Joshi
> Assignee: Jian He
> Attachments: YARN-1416.1.patch, YARN-1416.1.patch
>
>
> It might be worth checking why they are reporting this.
> Testcase : TestRMAppTransitions, TestRM
> there are large number of such errors.
> can't handle RMAppEventType.APP_UPDATE_SAVED at RMAppState.FAILED