[ 
https://issues.apache.org/jira/browse/YARN-11838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18011553#comment-18011553
 ] 

ASF GitHub Bot commented on YARN-11838:
---------------------------------------

violetnspct commented on PR #7828:
URL: https://github.com/apache/hadoop/pull/7828#issuecomment-3146269577

   @shameersss1 Could you add unit tests for the following two edge cases, or 
are they already covered?
   
   1. Lock acquisition failure. This matters because acquisition can fail 
under high contention.
   2. An exception thrown inside the locked section. This matters because the 
lock must still be released on error.
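
A minimal sketch of how the two cases listed above could be exercised, assuming the fix guards the attribute map with a `ReentrantReadWriteLock`. The class `GuardedAttributes` and its methods `tryPut`, `readLocked`, and `readLockCount` are hypothetical stand-ins for illustration, not code from the PR or from `NodeAttributesManagerImpl`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical stand-in for a lock-guarded attribute map; not code from the PR.
final class GuardedAttributes {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, String> attributes = new HashMap<>();

  // Case 1: bounded write-lock acquisition. Returns false on timeout or
  // interruption instead of blocking indefinitely.
  boolean tryPut(String node, String attribute, long timeoutMs) {
    boolean acquired;
    try {
      acquired = lock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
    if (!acquired) {
      return false;
    }
    try {
      attributes.put(node, attribute);
      return true;
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Case 2: run an action under the read lock; the finally block releases
  // the lock even when the action throws.
  <T> T readLocked(Function<Map<String, String>, T> action) {
    lock.readLock().lock();
    try {
      return action.apply(attributes);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Number of read locks currently held, so a test can check for leaks.
  int readLockCount() {
    return lock.getReadLockCount();
  }
}
```

A test for case 2 could pass a throwing action to `readLocked` and then assert that `readLockCount()` is 0. A test for case 1 could call `tryPut` from inside `readLocked`: a read lock cannot be upgraded to the write lock, so the acquisition times out and `tryPut` should return false.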




> YARN ConcurrentModificationException When Refreshing Node Attributes
> --------------------------------------------------------------------
>
>                 Key: YARN-11838
>                 URL: https://issues.apache.org/jira/browse/YARN-11838
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodeattibute, yarn
>    Affects Versions: 3.3.0, 3.2.1
>            Reporter: Syed Shameerur Rahman
>            Assignee: Syed Shameerur Rahman
>            Priority: Major
>              Labels: pull-request-available
>
> h2. The Problem Flow
>  # A new node is being added to the cluster (NODE_ADDED event)
>  # The CapacityScheduler calls addNode() method
>  # This triggers refreshNodeAttributesToScheduler() in 
> NodeAttributesManagerImpl
>  # During this process, the code attempts to convert a HashMap to a string 
> for logging
>  # While the HashMap is being iterated for string conversion, another thread 
> modifies the same HashMap: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/NodeAttributesManagerImpl.java#L748
>  # This causes the ConcurrentModificationException
>  
> {code:java}
> 025-07-17 19:23:37,166 ERROR org.apache.hadoop.yarn.event.EventDispatcher (SchedulerEventDispatcher:Event Processor): Error in handling event type NODE_ADDED to the Event Dispatcher
> java.util.ConcurrentModificationException
>         at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
>         at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1620)
>         at java.base/java.util.AbstractCollection.toString(AbstractCollection.java:456)
>         at java.base/java.lang.String.valueOf(String.java:4220)
>         at java.base/java.lang.StringBuilder.append(StringBuilder.java:173)
>         at java.base/java.util.AbstractMap.toString(AbstractMap.java:555)
>         at java.base/java.lang.String.valueOf(String.java:4220)
>         at java.base/java.lang.StringBuilder.append(StringBuilder.java:173)
>         at org.apache.hadoop.yarn.server.resourcemanager.nodelabels.NodeAttributesManagerImpl.refreshNodeAttributesToScheduler(NodeAttributesManagerImpl.java:748)
>  {code}
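
The stack trace in the quoted description ends in `HashMap$HashIterator.nextNode`, which is where `HashMap`'s fail-fast iterator detects a structural modification. The sketch below reproduces that check deterministically on a single thread, and shows one possible way to log safely by copying the map under a lock first. The class `FailFastDemo`, its method names, and the use of a plain monitor lock are assumptions for illustration; the snapshot is only safe if every writer synchronizes on the same lock object.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; not part of Hadoop.
final class FailFastDemo {

  // Returns true if adding an entry during iteration raised
  // ConcurrentModificationException from the fail-fast iterator.
  static boolean modifyWhileIterating() {
    Map<String, String> map = new HashMap<>();
    map.put("node1", "attrA");
    map.put("node2", "attrB");
    try {
      for (String key : map.keySet()) {
        // Structural modification: the iterator's next() call detects it.
        map.put(key + "-new", "attrC");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // Copies the shared map while holding the lock, then builds the log string
  // from the private copy so toString() never iterates the shared instance.
  // Assumes all writers to 'shared' also synchronize on 'lock'.
  static String snapshotToString(Map<String, String> shared, Object lock) {
    Map<String, String> copy;
    synchronized (lock) {
      copy = new HashMap<>(shared);
    }
    return copy.toString();
  }
}
```

In the reported failure the modification comes from another thread rather than from the iterating loop itself, but the detection mechanism in the iterator is the same.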


