[
https://issues.apache.org/jira/browse/YARN-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971523#comment-16971523
]
kailiu_dev edited comment on YARN-8373 at 11/11/19 1:23 PM:
------------------------------------------------------------
Dear [~wilfreds],
Why does your patch add readLock.lock() around the code below? I know that
sortedNodeList already takes the readLock internally, so sortedNodeList is
already protected from node changes: node add, node remove and node resource
change all take the writeLock, so they cannot run while the sort is in progress.
readLock.lock();
+ try {
nodeIdList = nodeTracker.sortedNodeList(nodeAvailableResourceComparator);
+ } finally {
+ readLock.unlock();
}
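
For reference, the locking behavior this question relies on can be sketched with a standalone java.util.concurrent.locks.ReentrantReadWriteLock. This is only an illustration: the class ReadLockSketch and its methods are hypothetical and are not the ClusterNodeTracker or FairScheduler code. It assumes, as the comment above states, that writers go through the write lock of the same lock object. It shows that taking the read lock again inside an already read-locked section is allowed, and that a writer on another thread cannot enter while any read lock is held.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Stand-in for a method that takes the read lock internally.
    // Returns how many read holds the current thread has at that point.
    int innerRead() {
        lock.readLock().lock();
        try {
            return lock.getReadHoldCount();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Caller takes the read lock first, then calls innerRead() (nested read lock).
    int outerRead() {
        lock.readLock().lock();
        try {
            return innerRead();
        } finally {
            lock.readLock().unlock();
        }
    }

    // True if a different thread can acquire the write lock right now.
    boolean writerCanEnter() {
        boolean[] acquired = new boolean[1];
        Thread writer = new Thread(() -> {
            if (lock.writeLock().tryLock()) {
                acquired[0] = true;
                lock.writeLock().unlock();
            }
        });
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    // Same check, but performed while this thread holds the read lock.
    boolean writerCanEnterWhileReading() {
        lock.readLock().lock();
        try {
            return writerCanEnter();
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

In this sketch outerRead() reports two read holds, so the nested lock is legal, and writerCanEnterWhileReading() returns false. Whether every node resource update in the scheduler actually goes through that write lock is the open point of the question and is not verified here.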
> RM Received RMFatalEvent of type CRITICAL_THREAD_CRASH
> -------------------------------------------------------
>
> Key: YARN-8373
> URL: https://issues.apache.org/jira/browse/YARN-8373
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 2.9.0
> Reporter: Girish Bhat
> Assignee: Wilfred Spiegelenburg
> Priority: Major
> Labels: newbie
> Attachments: YARN-8373.001.patch, YARN-8373.002.patch,
> YARN-8373.003.patch
>
>
>
>
> {noformat}
> sudo -u yarn /usr/local/hadoop/latest/bin/yarn version
> Hadoop 2.9.0
> Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50
> Compiled by arsuresh on 2017-11-13T23:15Z
> Compiled with protoc 2.5.0
> From source with checksum 0a76a9a32a5257331741f8d5932f183
> This command was run using /usr/local/hadoop/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar{noformat}
> This is for version 2.9.0
>
> {noformat}
> 2018-05-25 05:53:12,742 ERROR
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Received
> RMFatalEvent of type CRITICAL_THREAD_CRASH, caused by a critical thread,
> FairSchedulerContinuousScheduling, that exited unexpectedly:
> java.lang.IllegalArgumentException: Comparison method violates its general
> contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,743 FATAL
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Shutting down
> the resource manager.
> 2018-05-25 05:53:12,749 INFO org.apache.hadoop.util.ExitUtil: Exiting with
> status 1: a critical thread, FairSchedulerContinuousScheduling, that exited
> unexpectedly: java.lang.IllegalArgumentException: Comparison method violates
> its general contract!
> at java.util.TimSort.mergeHi(TimSort.java:899)
> at java.util.TimSort.mergeAt(TimSort.java:516)
> at java.util.TimSort.mergeForceCollapse(TimSort.java:457)
> at java.util.TimSort.sort(TimSort.java:254)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1454)
> at java.util.Collections.sort(Collections.java:175)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.ClusterNodeTracker.sortedNodeList(ClusterNodeTracker.java:340)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:907)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)
> 2018-05-25 05:53:12,772 ERROR
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager:
> ExpiredTokenRemover received java.lang.InterruptedException: sleep
> interrupted{noformat}
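
TimSort throws "Comparison method violates its general contract!" when it detects that the comparator gave inconsistent answers during one sort. One way this can happen is when the values being compared change while the sort is running. The sketch below shows a generic way to keep a comparator stable: copy the sort keys once, then compare only the copies. It is not the change made in the attached patches. The Node class, the availableMb field and sortBySnapshot are hypothetical stand-ins, not YARN classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotSortSketch {
    // Hypothetical stand-in for a scheduler node; not the YARN node class.
    static final class Node {
        final String id;
        volatile int availableMb; // may be updated by another thread

        Node(String id, int availableMb) {
            this.id = id;
            this.availableMb = availableMb;
        }
    }

    // Returns node ids ordered by available memory (largest first, ties by id).
    // Each node's value is read exactly once, before sorting starts, so the
    // comparator always sees the same value for the same node.
    // Assumes node ids are unique.
    static List<String> sortBySnapshot(List<Node> nodes) {
        Map<String, Integer> snapshot = new HashMap<>();
        for (Node n : nodes) {
            snapshot.put(n.id, n.availableMb);
        }
        List<String> ids = new ArrayList<>(snapshot.keySet());
        ids.sort((a, b) -> {
            int byMem = Integer.compare(snapshot.get(b), snapshot.get(a));
            return byMem != 0 ? byMem : a.compareTo(b);
        });
        return ids;
    }
}
```

The trade-off is that the resulting order reflects the values at snapshot time, which may already be slightly stale when the list is used.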