[ 
https://issues.apache.org/jira/browse/YARN-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828287#comment-13828287
 ] 

Omkar Vinit Joshi commented on YARN-1416:
-----------------------------------------

Thanks [~jianhe]

I have a basic question: shouldn't the RM have crashed? We can't just ignore 
such invalid state transitions, can we? I see that someone has modified 
RMAppImpl.java to log the exception but otherwise ignore it. 
{code}
      try {
        /* keep the master in sync with the state machine */
        this.stateMachine.doTransition(event.getType(), event);
      } catch (InvalidStateTransitonException e) {
        LOG.error("Can't handle this event at current state", e);
        /* TODO fail the application on the failed transition */
      }
{code}
I see that we ignore this exception after logging it in other places too. I'm 
not sure this is right, because we may move the system into a corrupted state 
without crashing or stopping it. At the very least, we should add assert 
statements to all the state machines so that such transitions don't go unnoticed.
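
To make the suggestion concrete, here is a minimal, self-contained sketch of a handler that still logs the invalid transition but can also be told to fail loudly. It is not Hadoop code: the class StrictTransitionSketch, its State and Event enums, the strict flag, and InvalidTransitionException (a stand-in for InvalidStateTransitonException) are all hypothetical names made up for illustration. It throws AssertionError explicitly rather than using the assert keyword so that the behaviour does not depend on the JVM running with -ea.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-in for a YARN state machine; none of these names are Hadoop APIs.
class StrictTransitionSketch {
    enum State { NEW, SUBMITTED, RUNNING }
    enum Event { START, APP_ACCEPTED }

    /** Stand-in for Hadoop's InvalidStateTransitonException. */
    static class InvalidTransitionException extends RuntimeException {
        InvalidTransitionException(State state, Event event) {
            super("Invalid event: " + event + " at " + state);
        }
    }

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private final boolean strict;   // assumption: enabled in tests, disabled in production
    private State current = State.NEW;
    private int invalidCount = 0;

    StrictTransitionSketch(boolean strict) {
        this.strict = strict;
        addTransition(State.NEW, Event.START, State.SUBMITTED);
        addTransition(State.SUBMITTED, Event.APP_ACCEPTED, State.RUNNING);
    }

    private void addTransition(State from, Event on, State to) {
        table.computeIfAbsent(from, k -> new EnumMap<>(Event.class)).put(on, to);
    }

    /** Looks up the next state, or throws if the event is not legal in the current state. */
    private State doTransition(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new InvalidTransitionException(current, event);
        }
        return next;
    }

    void handle(Event event) {
        try {
            current = doTransition(event);
        } catch (InvalidTransitionException e) {
            // Same logging as today, plus a counter that tests can inspect.
            invalidCount++;
            System.err.println("Can't handle this event at current state: " + e.getMessage());
            if (strict) {
                // Fail loudly instead of silently continuing in a possibly corrupted state.
                throw new AssertionError("Unexpected state transition", e);
            }
        }
    }

    State getState() { return current; }
    int getInvalidCount() { return invalidCount; }
}
```

With strict set to false the sketch behaves like the current code (log and continue), while a test could either construct it with strict set to true or check that getInvalidCount() is zero at the end of the run.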

I applied the patch and tested it locally. One more test needs to be fixed:
{code}
2013-11-20 15:23:52,127 INFO  [AsyncDispatcher event handler] attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(645)) - appattempt_1384989831257_0042_000001 State change from NEW to SUBMITTED
2013-11-20 15:23:52,129 ERROR [AsyncDispatcher event handler] rmapp.RMAppImpl (RMAppImpl.java:handle(593)) - Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: APP_ACCEPTED at RUNNING
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:591)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.handle(RMAppImpl.java:77)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions$TestApplicationEventDispatcher.handle(TestRMAppTransitions.java:139)
        at org.apache.hadoop.yarn.server.resourcemanager.rmapp.TestRMAppTransitions$TestApplicationEventDispatcher.handle(TestRMAppTransitions.java:125)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:159)
        at org.apache.hadoop.yarn.event.DrainDispatcher$1.run(DrainDispatcher.java:65)
        at java.lang.Thread.run(Thread.java:680)
{code}



> InvalidStateTransitions getting reported in multiple test cases even though 
> they pass
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-1416
>                 URL: https://issues.apache.org/jira/browse/YARN-1416
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Jian He
>         Attachments: YARN-1416.1.patch, YARN-1416.1.patch
>
>
> It might be worth checking why they are reporting this.
> Test cases: TestRMAppTransitions, TestRM
> There are a large number of such errors:
> can't handle RMAppEventType.APP_UPDATE_SAVED at RMAppState.FAILED



--
This message was sent by Atlassian JIRA
(v6.1#6144)
