[
https://issues.apache.org/jira/browse/YARN-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706432#comment-13706432
]
Zhijie Shen commented on YARN-292:
----------------------------------
{code}
// Acquire the AM container from the scheduler.
Allocation amContainerAllocation = appAttempt.scheduler.allocate(
appAttempt.applicationAttemptId, EMPTY_CONTAINER_REQUEST_LIST,
EMPTY_CONTAINER_RELEASE_LIST, null, null);
{code}
The above code will eventually pull the newly allocated containers in
newlyAllocatedContainers.
Logically, AMContainerAllocatedTransition happens after RMAppAttempt receives
CONTAINER_ALLOCATED. CONTAINER_ALLOCATED is sent during
ContainerStartedTransition, when RMContainer is moving from NEW to ALLOCATED.
Therefore, pulling newlyAllocatedContainers happens when RMContainer is at
ALLOCATED. In contrast, RMContainer is added to newlyAllocatedContainers when
it is still at NEW. In conclusion, one container in the allocation is expected
in AMContainerAllocatedTransition.
Hinted by [~nemon], the problem may happen at
{code}
FiCaSchedulerApp application = getApplication(applicationAttemptId);
if (application == null) {
LOG.error("Calling allocate on removed " +
"or non existant application " + applicationAttemptId);
return EMPTY_ALLOCATION;
}
{code}
EMPTY_ALLOCATION has 0 container. Another observation is that there seems to be
inconsistent synchronization on accessing the application map.
Suddenly be aware that [~djp] has started working on this problem. Please feel
free to take it over. Thanks!
> ResourceManager throws ArrayIndexOutOfBoundsException while handling
> CONTAINER_ALLOCATED for application attempt
> ----------------------------------------------------------------------------------------------------------------
>
> Key: YARN-292
> URL: https://issues.apache.org/jira/browse/YARN-292
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Affects Versions: 2.0.1-alpha
> Reporter: Devaraj K
> Assignee: Zhijie Shen
>
> {code:xml}
> 2012-12-26 08:41:15,030 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler:
> Calling allocate on removed or non existant application
> appattempt_1356385141279_49525_000001
> 2012-12-26 08:41:15,031 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in
> handling event type CONTAINER_ALLOCATED for applicationAttempt
> application_1356385141279_49525
> java.lang.ArrayIndexOutOfBoundsException: 0
> at java.util.Arrays$ArrayList.get(Arrays.java:3381)
> at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:655)
> at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:644)
> at
> org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:357)
> at
> org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:298)
> at
> org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:43)
> at
> org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:443)
> at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:490)
> at
> org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:80)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:433)
> at
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:414)
> at
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:126)
> at
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
> at java.lang.Thread.run(Thread.java:662)
> {code}
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira