[ 
https://issues.apache.org/jira/browse/YARN-2416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096250#comment-14096250
 ] 

Jian Fang commented on YARN-2416:
---------------------------------

Here is the log for the events and state transitions.

2014-07-05 08:57:41,748 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1404549222428_0001_01_000001 Container Transitioned from NEW to 
ALLOCATED

2014-07-05 08:57:41,760 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1404549222428_0001_01_000001 Container Transitioned from ALLOCATED to 
ACQUIRED

2014-07-05 08:57:41,833 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to 
launch container container_1404549222428_0001_01_000001 : $JAVA_HOME/bin/java 
-Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 
-Dhadoop.root.logger=INFO,CLA -Xmx2048m 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
2><LOG_DIR>/stderr


2014-07-05 08:57:42,737 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1404549222428_0001_01_000001 Container Transitioned from ACQUIRED to 
RUNNING
2014-07-05 08:58:54,290 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1404549222428_0001_01_000001 Container Transitioned from RUNNING to 
KILLED

2014-07-05 08:58:54,290 INFO 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp:
 Completed container: container_1404549222428_0001_01_000001 in state: KILLED 
event:KILL

2014-07-05 08:58:54,394 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1404549222428_0001_000002 State change from SCHEDULED to 
ALLOCATED_SAVING
2014-07-05 08:58:54,394 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
appattempt_1404549222428_0001_000002 State change from ALLOCATED_SAVING to 
ALLOCATED
2014-07-05 08:58:54,395 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Launching 
masterappattempt_1404549222428_0001_000002

2014-07-05 08:58:54,397 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Command to 
launch container container_1404549222428_0001_02_000001 : $JAVA_HOME/bin/java 
-Dlog4j.configuration=container-log4j.properties 
-Dyarn.app.container.log.dir=<LOG_DIR> -Dyarn.app.container.log.filesize=0 
-Dhadoop.root.logger=INFO,CLA  -Xmx2048m 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 
2><LOG_DIR>/stderr 

2014-07-05 08:58:55,396 INFO 
org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
container_1404549222428_0001_02_000001 Container Transitioned from ACQUIRED to 
RUNNING

2014-07-05 08:59:05,020 INFO 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService: AM 
registration appattempt_1404549222428_0001_000002
2014-07-05 08:59:05,021 INFO 
org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   
IP=10.198.10.201        OPERATION=Register App Master   
TARGET=ApplicationMasterService RESULT=SUCCESS  
APPID=application_1404549222428_0001    
APPATTEMPTID=appattempt_1404549222428_0001_000002

2014-07-05 08:59:12,653 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done 
launching container Container: [ContainerId: 
container_1404549222428_0001_01_000001, NodeId: 
ip-10-198-22-164.us-west-1.compute.internal:9103, NodeHttpAddress: 
ip-10-198-22-164.us-west-1.compute.internal:9035, Resource: <memory:3378, 
vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 
10.198.22.164:9103 }, ] for AM appattempt_1404549222428_0001_000001
2014-07-05 08:59:12,653 INFO 
org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done 
launching container Container: [ContainerId: 
container_1404549222428_0001_02_000001, NodeId: 
ip-10-198-10-201.us-west-1.compute.internal:9103, NodeHttpAddress: 
ip-10-198-10-201.us-west-1.compute.internal:9035, Resource: <memory:3378, 
vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 
10.198.10.201:9103 }, ] for AM appattempt_1404549222428_0001_000002

> InvalidStateTransitonException in ResourceManager if AMLauncher does not 
> receive response for startContainers() call in time
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-2416
>                 URL: https://issues.apache.org/jira/browse/YARN-2416
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Jian Fang
>
> AMLauncher calls startContainers(allRequests) to launch a container for 
> application master. Normally, the call comes back immediately so that the 
> RMAppAttempt changes its state from ALLOCATED to LAUNCHED. 
> However, we do observed that in some cases, the RPC call came back very late 
> but the AM container was already started. Because the RMAppAttempt stuck in 
> ALLOCATED state, once resource manager received the REGISTERED event from the 
> application master, it threw InvalidStateTransitonException as follows.
> 2014-07-05 08:59:05,021 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> REGISTERED at ALLOCATED
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:652)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:752)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:744)
> For subsequent STATUS_UPDATE and CONTAINER_ALLOCATED events for this job, 
> resource manager kept throwing InvalidStateTransitonException.
> 2014-07-05 08:59:06,152 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> STATUS_UPDATE at ALLOCATED
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:652)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:752)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:744)
> 2014-07-05 08:59:07,779 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1404549222428_0001_02_000002 Container Transitioned from NEW to
>  ALLOCATED
> 2014-07-05 08:59:07,779 ERROR 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: 
> Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> CONTAINER_ALLOCATED at ALLOCATED
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at 
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:652)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:752)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:733)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:744)



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to