[ https://issues.apache.org/jira/browse/YARN-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18010006#comment-18010006 ]
ASF GitHub Bot commented on YARN-11838:
---------------------------------------

sjlee commented on PR #7828:
URL: https://github.com/apache/hadoop/pull/7828#issuecomment-3119587988

   Thanks for the follow-up, @shameersss1.

   Looking at the exception stack trace from the JIRA, it is `host.attributes` that was concurrently modified, isn't it? It is `host.attributes` that needs to be protected. `newNodeToAttributesMap` is not a factor as it is a local variable and thus not shared. And, it is also the logging line itself (l.748) that is an unsafe read.

   IMO, the right fix is to protect the logging line (l.748) with the read lock. Please let me know.


> YARN ConcurrentModificationException When Refreshing Node Attributes
> --------------------------------------------------------------------
>
>                 Key: YARN-11838
>                 URL: https://issues.apache.org/jira/browse/YARN-11838
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodeattibute, yarn
>   Affects Versions: 3.3.0, 3.2.1
>            Reporter: Syed Shameerur Rahman
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>
> h2. The Problem Flow
> # A new node is being added to the cluster (NODE_ADDED event)
> # The CapacityScheduler calls addNode() method
> # This triggers refreshNodeAttributesToScheduler() in NodeAttributesManagerImpl
> # During this process, the code attempts to convert a HashMap to a string for logging
> # While iterating through the HashMap for string conversion, another thread modifies the same HashMap : https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/NodeAttributesManagerImpl.java#L748
> # This causes the ConcurrentModificationException
>
> {code:java}
> 025-07-17 19:23:37,166 ERROR org.apache.hadoop.yarn.event.EventDispatcher (SchedulerEventDispatcher:Event Processor): Error in handling event type NODE_ADDED to the Event Dispatcher
> java.util.ConcurrentModificationException
>     at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
>     at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1620)
>     at java.base/java.util.AbstractCollection.toString(AbstractCollection.java:456)
>     at java.base/java.lang.String.valueOf(String.java:4220)
>     at java.base/java.lang.StringBuilder.append(StringBuilder.java:173)
>     at java.base/java.util.AbstractMap.toString(AbstractMap.java:555)
>     at java.base/java.lang.String.valueOf(String.java:4220)
>     at java.base/java.lang.StringBuilder.append(StringBuilder.java:173)
>     at org.apache.hadoop.yarn.server.resourcemanager.nodelabels.NodeAttributesManagerImpl.refreshNodeAttributesToScheduler(NodeAttributesManagerImpl.java:748)
> {code}
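
For illustration only, here is a minimal sketch of the fix suggested in the review comment: building the log string while holding the read lock, so that the iteration performed by `toString()` cannot overlap with a writer. The class `HostAttributesSketch`, its methods, and the log message text are hypothetical stand-ins and are not the actual `NodeAttributesManagerImpl` code. The sketch also assumes that every writer already mutates the shared map under the matching write lock.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the shared state discussed above;
// NOT the actual NodeAttributesManagerImpl implementation.
class HostAttributesSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  // Shared, mutable map (plays the role of host.attributes in the comment).
  private final Map<String, Set<String>> attributes = new HashMap<>();

  // Writers mutate the map only while holding the write lock.
  void addAttribute(String host, String attribute) {
    lock.writeLock().lock();
    try {
      attributes.computeIfAbsent(host, h -> new HashSet<>()).add(attribute);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // The string is built under the read lock, so HashMap.toString() never
  // iterates while a writer holds the write lock.
  String describeForLogging() {
    lock.readLock().lock();
    try {
      return "Updated node attributes: " + attributes;
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

In this sketch the returned string is an immutable snapshot, so it can be passed to the logger after the read lock has been released. Only the string conversion needs to be guarded.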