[ 
https://issues.apache.org/jira/browse/YARN-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16970251#comment-16970251
 ] 

Wilfred Spiegelenburg commented on YARN-9940:
---------------------------------------------

The changes you are making are also not helpful in hadoop 2.7. When you 
synchronise a method you do the same as placing all the code inside the method 
within a block synchronised on {{this}}. Only one thread at a time can run any 
synchronised method or {{synchronized (this)}} block on the same object instance.
 These two code samples are effectively the same:
{code:java}
public synchronized void myMethod() {
  // all my code here...
}
{code}
and
{code:java}
public void myMethod() {
  synchronized (this) {
    // all my code here...
  }
}
{code}
Since the method {{FairScheduler.completedContainer()}} is already synchronised, 
your change of adding a synchronised block inside the method does not make a 
difference. Java's intrinsic locks are reentrant, so the inner block only 
re-acquires a lock the thread already holds, and the JIT compiler may optimise 
it away.
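
To make the point about the nested block concrete, here is a minimal sketch. The class {{ReentrantSyncDemo}} and its counter are hypothetical stand-ins and not the real FairScheduler code; the method name is borrowed only for illustration. It shows that the thread already owns the monitor of {{this}} before it reaches the inner block:

```java
public class ReentrantSyncDemo {
    private int completed = 0;

    // Stand-in for a method that is already declared synchronized.
    public synchronized boolean completedContainer() {
        // The calling thread owns the monitor of 'this' as soon as it is in here.
        boolean heldBefore = Thread.holdsLock(this);
        synchronized (this) {
            // Re-entering the same monitor: no waiting and no extra exclusion.
            completed++;
        }
        return heldBefore;
    }

    public synchronized int getCompleted() {
        return completed;
    }

    public static void main(String[] args) {
        ReentrantSyncDemo demo = new ReentrantSyncDemo();
        // Prints "lock held before inner block: true"
        System.out.println("lock held before inner block: " + demo.completedContainer());
    }
}
```

The inner block would only add protection if it locked on a different object than the one the method itself synchronises on.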

Hadoop 2.7.2 does not have read/write locks in the scheduler at all, so I don't 
know what version you are running, but it is not hadoop 2.7. The read/write 
locks were introduced in YARN-3139, which is only in hadoop 2.9 and later. The 
same goes for the line numbers in the stack trace: they do not line up with the 
2.7 release.

As per YARN-8373:
 - the test is not really testing anything, as it holds the scheduler lock while 
calling {{deductUnallocatedResource()}}. This does not happen in the real code, 
so the lock should not be there.
 - the best solution is to move to a PriorityQueue for the sorted list. That 
really fixes the issue, as the test shows when run without the lock in place.
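
A rough sketch of the PriorityQueue idea follows. It is not the YARN-8373 patch: the {{Node}} class, the {{unallocatedMb}} field and {{orderByUnallocated()}} are hypothetical names made up for illustration. The assumption is that nodes are ordered by unallocated memory, largest first. As far as I know, a binary heap only compares pairs of elements as they are inserted and removed and has no check like the one in TimSort that throws this exception, so a value changing in the middle can at worst give a slightly stale order:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class NodeQueueSketch {
    // Hypothetical stand-in for a scheduler node; not a YARN class.
    static final class Node {
        final String name;
        volatile int unallocatedMb; // may be updated by other threads

        Node(String name, int unallocatedMb) {
            this.name = name;
            this.unallocatedMb = unallocatedMb;
        }
    }

    // Returns node names ordered by unallocated memory, largest first.
    static List<String> orderByUnallocated(List<Node> nodes) {
        PriorityQueue<Node> queue = new PriorityQueue<>(
            Comparator.comparingInt((Node n) -> n.unallocatedMb).reversed());
        queue.addAll(nodes);

        List<String> order = new ArrayList<>();
        Node next;
        while ((next = queue.poll()) != null) {
            order.add(next.name);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new Node("a", 1024));
        nodes.add(new Node("b", 4096));
        nodes.add(new Node("c", 2048));
        // Prints [b, c, a]
        System.out.println(orderByUnallocated(nodes));
    }
}
```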

> avoid continuous scheduling thread crashes while sorting nodes get 
> 'Comparison method violates its general contract'
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-9940
>                 URL: https://issues.apache.org/jira/browse/YARN-9940
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.2
>            Reporter: kailiu_dev
>            Assignee: kailiu_dev
>            Priority: Major
>         Attachments: YARN-9940-branch-2.7.2.001.patch
>
>
> 2019-10-16 09:14:51,215 ERROR 
> org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread 
> Thread[FairSchedulerContinuousScheduling,5,main] threw an Exception.
> java.lang.IllegalArgumentException: Comparison method violates its general 
> contract!
>         at java.util.TimSort.mergeHi(TimSort.java:868)
>         at java.util.TimSort.mergeAt(TimSort.java:485)
>         at java.util.TimSort.mergeForceCollapse(TimSort.java:426)
>         at java.util.TimSort.sort(TimSort.java:223)
>         at java.util.TimSort.sort(TimSort.java:173)
>         at java.util.Arrays.sort(Arrays.java:659)
>         at java.util.Collections.sort(Collections.java:217)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.continuousSchedulingAttempt(FairScheduler.java:1117)
>         at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$ContinuousSchedulingThread.run(FairScheduler.java:296)


