[ https://issues.apache.org/jira/browse/YARN-3419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dustin Cote updated YARN-3419:
------------------------------
    Attachment: YARN-3419-1.patch

Attaching a patch moving to a thread safe ArrayList implementation

> ConcurrentModificationException in FSLeafQueue
> ----------------------------------------------
>
>                 Key: YARN-3419
>                 URL: https://issues.apache.org/jira/browse/YARN-3419
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 2.5.0
>            Reporter: Dustin Cote
>            Assignee: Dustin Cote
>         Attachments: YARN-3419-1.patch
>
>
> Heavy Resource Manager use causes a manifestation of a
> ConcurrentModificationException in FSLeafQueue. Doesn't look like
> FSLeafQueue does anything except add, remove, traverse, and get sorted, so I
> think we could use a CopyOnWriteArrayList that will use a bit more memory but
> remove these exceptions. Seems to me that there will be relatively few app
> adds compared to the number of traversals. Stack trace below:
>
> 2015-03-27 00:47:34,773 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type CONTAINER_ALLOCATED for applicationAttempt application_1427401429921_3388
> java.util.ConcurrentModificationException
>         at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>         at java.util.ArrayList$Itr.next(ArrayList.java:831)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java:147)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getHeadroom(FSAppAttempt.java:180)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.allocate(FairScheduler.java:923)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:929)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl$AMContainerAllocatedTransition.transition(RMAppAttemptImpl.java:922)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
>         at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:110)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:765)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:746)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>         at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>         at java.lang.Thread.run(Thread.java:744)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
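
For readers unfamiliar with the failure mode in the issue description, the following is a minimal sketch of why swapping ArrayList for CopyOnWriteArrayList removes the exception. It is not the attached patch and not FSLeafQueue code: the class CowListSketch and the methods sumWhileAdding and throwsCme are hypothetical names made up for illustration. The reported failure involved several threads; this sketch triggers the same fail-fast iterator check deterministically in a single thread by adding an element in the middle of a traversal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CowListSketch {

    // Hypothetical stand-in for a traversal such as summing per-app usage.
    // It adds one element during the first iteration to simulate an app
    // being added to the queue while another caller is iterating.
    static int sumWhileAdding(List<Integer> usages) {
        int total = 0;
        boolean added = false;
        for (int usage : usages) {
            total += usage;
            if (!added) {
                usages.add(100); // structural modification mid-traversal
                added = true;
            }
        }
        return total;
    }

    // Returns true if the traversal above fails with the exception
    // shown in the stack trace.
    static boolean throwsCme(List<Integer> usages) {
        try {
            sumWhileAdding(usages);
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> plain = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> cow = new CopyOnWriteArrayList<>(Arrays.asList(1, 2, 3));

        // ArrayList iterators are fail-fast: the add() invalidates the iterator.
        System.out.println("ArrayList throws CME: " + throwsCme(plain));

        // CopyOnWriteArrayList iterators walk a snapshot taken when the
        // iterator was created, so the traversal completes over the old contents.
        System.out.println("CopyOnWriteArrayList sum: " + sumWhileAdding(cow));
        System.out.println("CopyOnWriteArrayList size afterwards: " + cow.size());
    }
}
```

The trade-off matches the one described in the issue: every add or remove on a CopyOnWriteArrayList copies the backing array, which costs memory and time per write, while traversals need no locking. That is a reasonable fit only if writes are rare relative to traversals, as the reporter expects here.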