[ https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706432#comment-13706432 ]
Zhijie Shen commented on YARN-292:
----------------------------------

{code}
// Acquire the AM container from the scheduler.
Allocation amContainerAllocation = appAttempt.scheduler.allocate(
    appAttempt.applicationAttemptId, EMPTY_CONTAINER_REQUEST_LIST,
    EMPTY_CONTAINER_RELEASE_LIST, null, null);
{code}

The above code eventually pulls the newly allocated containers from newlyAllocatedContainers.

Logically, AMContainerAllocatedTransition happens after RMAppAttempt receives CONTAINER_ALLOCATED. CONTAINER_ALLOCATED is sent during ContainerStartedTransition, when RMContainer moves from NEW to ALLOCATED. Therefore, newlyAllocatedContainers is pulled when RMContainer is already at ALLOCATED. In contrast, RMContainer is added to newlyAllocatedContainers while it is still at NEW. In conclusion, one container is expected in the allocation in AMContainerAllocatedTransition.

Hinted by [~nemon], the problem may lie in

{code}
FiCaSchedulerApp application = getApplication(applicationAttemptId);
if (application == null) {
  LOG.error("Calling allocate on removed " +
      "or non existant application " + applicationAttemptId);
  return EMPTY_ALLOCATION;
}
{code}

EMPTY_ALLOCATION contains no containers, which would explain the ArrayIndexOutOfBoundsException: 0 in the reported stack trace when AMContainerAllocatedTransition reads the first container.

Another observation is that access to the application map seems to be inconsistently synchronized.

I just noticed that [~djp] has started working on this problem. Please feel free to take it over. Thanks!
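To illustrate the failure mode described above, here is a minimal, self-contained sketch. It does not use the real ResourceManager classes: `AmAllocationSketch`, `allocate(boolean)`, and both `firstContainer*` helpers are hypothetical stand-ins, and containers are modeled as plain strings. It only shows why reading index 0 of an empty fixed-size list throws, and one possible way a caller could guard against it; it is not the actual fix for YARN-292.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; none of these names exist in Hadoop.
class AmAllocationSketch {
    // Stand-in for EMPTY_ALLOCATION: an allocation holding zero containers.
    static final List<String> EMPTY_ALLOCATION = Arrays.<String>asList();

    // Stand-in for scheduler.allocate(...): returns the empty allocation
    // when the application is no longer known to the scheduler.
    static List<String> allocate(boolean applicationKnown) {
        if (!applicationKnown) {
            return EMPTY_ALLOCATION;
        }
        return Arrays.asList("container_01");
    }

    // Unguarded access: assumes at least one container is always present.
    static String firstContainerUnguarded(List<String> containers) {
        return containers.get(0);
    }

    // Guarded access: reports the absence of a container instead of throwing.
    static Optional<String> firstContainerGuarded(List<String> containers) {
        return containers.isEmpty()
            ? Optional.empty()
            : Optional.of(containers.get(0));
    }

    public static void main(String[] args) {
        try {
            firstContainerUnguarded(allocate(false));
        } catch (IndexOutOfBoundsException e) {
            // Arrays.asList(...) is array-backed, so this is an
            // ArrayIndexOutOfBoundsException, as in the reported trace.
            System.out.println("unguarded access failed: " + e.getClass().getName());
        }
        System.out.println("guarded access present: "
            + firstContainerGuarded(allocate(false)).isPresent());
    }
}
```

Whether the right remedy is a guard in the caller, a different return value from the scheduler, or consistent synchronization on the application map is left open by the comment above; the sketch only covers the first option.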
> ResourceManager throws ArrayIndexOutOfBoundsException while handling CONTAINER_ALLOCATED for application attempt
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-292
>                 URL: https://issues.apache.org/jira/browse/YARN-292
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.0.1-alpha
>            Reporter: Devaraj K
>            Assignee: Zhijie Shen
>
> {code:xml}
> 2012-12-26 08:41:15,030 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler: Calling allocate on removed or non existant application appattempt_1356385141279_49525_000001
> 2012-12-26 08:41:15,031 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1356385141279_49525
> java.lang.ArrayIndexOutOfBoundsException: 0
> 	at java.util.Arrays$ArrayList.get(Arrays.java:3381)
> 	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
> 	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> 	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> 	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
> 	at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
> 	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
> 	at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
> 	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
> 	at java.lang.Thread.run(Thread.java:662)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira