[
https://issues.apache.org/jira/browse/YARN-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541328#comment-14541328
]
zhihai xu commented on YARN-3619:
---------------------------------
I attached a test patch that reproduces this issue. It fails with the following
stack trace:
{code}
Running org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainerMetrics
Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.92 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainerMetrics
testContainerMetricsFinished(org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainerMetrics)  Time elapsed: 1.194 sec  <<< ERROR!
java.util.ConcurrentModificationException: null
	at java.util.LinkedHashMap$LinkedHashIterator.nextEntry(LinkedHashMap.java:394)
	at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:413)
	at java.util.LinkedHashMap$EntryIterator.next(LinkedHashMap.java:412)
	at org.apache.hadoop.metrics2.impl.MetricsSystemImpl.sampleMetrics(MetricsSystemImpl.java:403)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainerMetrics.testContainerMetricsFinished(TestContainerMetrics.java:144)
Results :
Tests in error:
TestContainerMetrics.testContainerMetricsFinished:144 » ConcurrentModification
Tests run: 3, Failures: 0, Errors: 1, Skipped: 0
{code}
> ContainerMetrics unregisters during getMetrics and leads to
> ConcurrentModificationException
> -------------------------------------------------------------------------------------------
>
> Key: YARN-3619
> URL: https://issues.apache.org/jira/browse/YARN-3619
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.7.0
> Reporter: Jason Lowe
> Assignee: zhihai xu
> Attachments: test.patch
>
>
> ContainerMetrics can unregister itself inside its getMetrics method, but
> getMetrics can be called from MetricsSystemImpl.sampleMetrics while that
> method is iterating over the registered sources. Removing a source in the
> middle of that iteration leads to a ConcurrentModificationException, logged
> like this:
> {noformat}
> 2015-05-11 14:00:20,360 [Timer for 'NodeManager' metrics system] WARN
> impl.MetricsSystemImpl: java.util.ConcurrentModificationException
> {noformat}
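
For readers unfamiliar with the failure mode, here is a minimal standalone
sketch of the pattern described above. It does not use any Hadoop classes:
the Source interface, the registry map, and all method names are hypothetical
stand-ins for a metrics source and the source map that sampleMetrics iterates.
The snapshot variant is only one possible way to avoid the exception and is
not necessarily the fix adopted for this issue.

```java
import java.util.ConcurrentModificationException;
import java.util.LinkedHashMap;
import java.util.Map;

final class SourceIterationSketch {

    /** Hypothetical stand-in for a metrics source; not a Hadoop interface. */
    interface Source {
        void getMetrics(Map<String, Source> registry, String name);
    }

    /** Builds a registry whose first source unregisters itself when sampled. */
    static Map<String, Source> newRegistry() {
        Map<String, Source> sources = new LinkedHashMap<>();
        // Mimics a finished container removing itself during getMetrics.
        sources.put("finishedContainer", (reg, name) -> reg.remove(name));
        sources.put("runningContainer", (reg, name) -> { });
        return sources;
    }

    /**
     * Iterates the live map. Returns false if the iteration was aborted by a
     * ConcurrentModificationException, true if it completed.
     */
    static boolean sampleUnsafe(Map<String, Source> sources) {
        try {
            for (Map.Entry<String, Source> e : sources.entrySet()) {
                e.getValue().getMetrics(sources, e.getKey());
            }
            return true;
        } catch (ConcurrentModificationException ex) {
            return false;
        }
    }

    /** Iterates a copy, so sources may remove themselves from the live map. */
    static void sampleOverSnapshot(Map<String, Source> sources) {
        Map<String, Source> snapshot = new LinkedHashMap<>(sources);
        for (Map.Entry<String, Source> e : snapshot.entrySet()) {
            e.getValue().getMetrics(sources, e.getKey());
        }
    }

    public static void main(String[] args) {
        System.out.println("live iteration completed: " + sampleUnsafe(newRegistry()));
        Map<String, Source> registry = newRegistry();
        sampleOverSnapshot(registry);
        System.out.println("remaining after snapshot iteration: " + registry.keySet());
    }
}
```

Note that the self-removing source is deliberately placed first in the map. If
it were the last entry, the iterator would finish before the modification
check runs and no exception would be thrown.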