[ https://issues.apache.org/jira/browse/YARN-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541592#comment-13541592 ]
shenhong commented on YARN-301:
-------------------------------

The reason is that when the SchedulerEventDispatcher thread runs assignContainer, it gets the priorities from AppSchedulingInfo. In AppSchedulingInfo the code is:

    synchronized public Collection<Priority> getPriorities() {
      return priorities;
    }

But this only returns a reference to priorities, so the lock is already released when the caller uses it. AppSchedulable#assignContainer then traverses the priorities:

    // (not scheduled) in order to promote better locality.
    for (Priority priority : app.getPriorities()) {
      app.addSchedulingOpportunity(priority);
      ...

On the other hand, when the RM processes a request from the AM, it updates the priorities in AppSchedulingInfo#updateResourceRequests:

    if (asks == null) {
      asks = new HashMap<String, ResourceRequest>();
      this.requests.put(priority, asks);
      this.priorities.add(priority);
    } else if (updatePendingResources) {

If this add happens while the scheduler thread is still iterating, the iteration throws ConcurrentModificationException.

> Fairscheduler appear to concurrentModificationException and RM crash
> --------------------------------------------------------------------
>
>                 Key: YARN-301
>                 URL: https://issues.apache.org/jira/browse/YARN-301
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>            Reporter: shenhong
>             Fix For: 2.0.3-alpha
>
>
> In my test cluster, the FairScheduler throws a ConcurrentModificationException and the RM crashes. Here is the message:
> 2012-12-30 17:14:17,171 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
> java.util.ConcurrentModificationException
>     at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
>     at java.util.TreeMap$KeyIterator.next(TreeMap.java:1154)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:297)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:181)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:842)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:340)
>     at java.lang.Thread.run(Thread.java:662)
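
For illustration only, here is a minimal, self-contained sketch of the failure mode described in the comment, together with one possible mitigation (returning a copy made under the lock). It is not the actual Hadoop code and not necessarily the fix applied for YARN-301. The class SchedulingInfo, its methods, and the use of Integer priorities are hypothetical stand-ins for AppSchedulingInfo and Priority. The add is done from the same thread inside the loop so the failure is deterministic; in the RM the add comes from another thread, which makes it timing-dependent.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class PrioritiesSnapshotDemo {

    // Hypothetical stand-in for AppSchedulingInfo; priorities are plain Integers.
    static class SchedulingInfo {
        // TreeSet is TreeMap-backed, matching the TreeMap iterator in the stack trace.
        private final Set<Integer> priorities = new TreeSet<>();

        synchronized void addPriority(int priority) {
            priorities.add(priority);
        }

        // Same shape as the quoted getter: the lock is released on return,
        // so the caller iterates the live set without holding it.
        synchronized Collection<Integer> getPrioritiesLive() {
            return priorities;
        }

        // One possible mitigation (an assumption, not the project's fix):
        // copy under the lock and hand out the copy.
        synchronized List<Integer> getPrioritiesSnapshot() {
            return new ArrayList<>(priorities);
        }
    }

    static SchedulingInfo newInfo() {
        SchedulingInfo info = new SchedulingInfo();
        info.addPriority(1);
        info.addPriority(2);
        info.addPriority(3);
        return info;
    }

    // Iterates the given view and adds to the underlying set on each step.
    // Returns true if the iteration fails with ConcurrentModificationException.
    static boolean failsWhenModifiedDuringIteration(SchedulingInfo info,
                                                    Collection<Integer> view) {
        try {
            for (Integer p : view) {
                // Stands in for updateResourceRequests adding a new priority.
                info.addPriority(p + 100);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        SchedulingInfo a = newInfo();
        System.out.println("live view fails: "
            + failsWhenModifiedDuringIteration(a, a.getPrioritiesLive()));      // true

        SchedulingInfo b = newInfo();
        System.out.println("snapshot fails: "
            + failsWhenModifiedDuringIteration(b, b.getPrioritiesSnapshot())); // false
    }
}
```

The snapshot trades a small copy per call for iteration that cannot be disturbed by later adds; the iterating thread may then miss a priority added after the copy was taken. Other options, such as holding the same lock for the whole traversal or using a concurrent sorted set, have different trade-offs and are not shown here.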