Eric Badger created YARN-7114:
---------------------------------

             Summary: NM can fail during shutdown with log aggregation
                 Key: YARN-7114
                 URL: https://issues.apache.org/jira/browse/YARN-7114
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 2.8.1
            Reporter: Eric Badger


{noformat}
2017-08-24 16:36:35,961 [AsyncDispatcher event handler] WARN application.ApplicationImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at FINISHING_CONTAINERS_WAIT
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:458)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:63)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1314)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1306)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
        at java.lang.Thread.run(Thread.java:745)
2017-08-24 16:36:35,962 [AsyncDispatcher event handler] INFO application.ApplicationImpl: Application application_1502220952225_46598 transitioned from FINISHING_CONTAINERS_WAIT to null
2017-08-24 16:36:36,056 [AsyncDispatcher event handler] WARN application.ApplicationImpl: Can't handle this event at current state
org.apache.hadoop.yarn.state.InvalidStateTransitionException: Invalid event: APPLICATION_LOG_HANDLING_FINISHED at FINISHING_CONTAINERS_WAIT
        at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
        at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
        at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:458)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl.handle(ApplicationImpl.java:63)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1314)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ApplicationEventDispatcher.handle(ContainerManagerImpl.java:1306)
        at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
        at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

This was triggered by an RM restart that upgraded the RM to a newer version. 
The NM's version was unchanged, so it was kicked out of the cluster during 
registration. The NM then ran log aggregation and failed when it finished, 
because log aggregation had never been requested in the normal flow (it was 
forced by the shutdown), so the application was not in a state that accepts 
APPLICATION_LOG_HANDLING_FINISHED. The failure was seen in 2.8, but I believe 
that this problem also exists in 2.9 and trunk.
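
To illustrate the failure mode, here is a minimal, self-contained sketch of a table-driven state machine that rejects an event with no registered transition. It is not the ApplicationImpl or StateMachineFactory code: the class AppStateSketch, the FINISHED state, and the CONTAINERS_DONE event are hypothetical names made up for this example. Only FINISHING_CONTAINERS_WAIT and APPLICATION_LOG_HANDLING_FINISHED are taken from the log above. The handler is written on the assumption that the new state is simply left unset when the transition is rejected, which would be consistent with the "transitioned from FINISHING_CONTAINERS_WAIT to null" line.

```java
import java.util.EnumMap;
import java.util.Map;

public class AppStateSketch {
    // FINISHING_CONTAINERS_WAIT and APPLICATION_LOG_HANDLING_FINISHED appear in the log;
    // FINISHED and CONTAINERS_DONE are hypothetical names used only for this sketch.
    enum State { FINISHING_CONTAINERS_WAIT, FINISHED }
    enum Event { CONTAINERS_DONE, APPLICATION_LOG_HANDLING_FINISHED }

    // Transition table: current state -> (event -> next state).
    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.FINISHING_CONTAINERS_WAIT;

    AppStateSketch() {
        // Assumption: while waiting for containers, only CONTAINERS_DONE is registered.
        Map<Event, State> fromWait = new EnumMap<>(Event.class);
        fromWait.put(Event.CONTAINERS_DONE, State.FINISHED);
        table.put(State.FINISHING_CONTAINERS_WAIT, fromWait);
    }

    // Throws when the (state, event) pair has no entry in the table.
    State doTransition(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event: " + event + " at " + current);
        }
        current = next;
        return next;
    }

    // Catches the rejection and logs it; newState stays null in that case.
    String handle(Event event) {
        State oldState = current;
        State newState = null;
        try {
            newState = doTransition(event);
        } catch (IllegalStateException e) {
            System.out.println("WARN Can't handle this event at current state: " + e.getMessage());
        }
        return "transitioned from " + oldState + " to " + newState;
    }

    State getCurrent() {
        return current;
    }

    public static void main(String[] args) {
        AppStateSketch app = new AppStateSketch();
        // A forced log-aggregation completion arrives while still waiting on containers.
        System.out.println(app.handle(Event.APPLICATION_LOG_HANDLING_FINISHED));
        // The registered event is still accepted afterwards.
        System.out.println(app.handle(Event.CONTAINERS_DONE));
    }
}
```

In this sketch the rejected event leaves the current state untouched, so the only visible symptoms are the warning and the misleading "to null" message.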



