[ 
https://issues.apache.org/jira/browse/YARN-11745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17914943#comment-17914943
 ] 

ASF GitHub Bot commented on YARN-11745:
---------------------------------------

Hean-Chhinling opened a new pull request, #7309:
URL: https://github.com/apache/hadoop/pull/7309

   ### Description of PR
   
   This PR addresses the TimSort contract violation issue in the 
`PriorityQueueComparator` class, which was identified when sorting queue with 
resource (0, 0) and queue resource(any number, any number). The comparator 
previously failed to maintain transitivity of TimSort, leading to 
`java.lang.IllegalArgumentException: Comparison method violates its general 
contract!` during sorting.
   
   **Root Cause**
   
   The issue occurred due to inconsistent comparison logic that violated the 
transitivity rules required by TimSort. 
   
   Specifically, at the following code lines the AND condition that only 
compare the resources if both queues' resources are not none. However, when one 
of the queue resource is `none or (0, 0)` and the other queue resource is `not 
none`  it skips this condition and go to compare based on the 
`absoluteCapacity` and when both of the queues' `absoluteCapacity` are the 
same, it leads to both queues equal each other even though their resources are 
different. 
   
   For more detail example of how this behaviour break the TimSort algorithm 
please see this attachment. 
[ExampleZeroQueueResourceProblem](https://issues.apache.org/jira/secure/attachment/13073555/ExampleZeroQueueResourceproblem.pdf
 )
   
   ```
      if (!minEffRes1.equals(Resources.none()) && !minEffRes2.equals(
               Resources.none())) {
             return minEffRes2.compareTo(minEffRes1);
           }
           
       float abs1 = q1Sort.absoluteCapacity;
       float abs2 = q2Sort.absoluteCapacity;
       return Float.compare(abs2, abs1);
   ```
   
   **Solution**
   
   Instead of checking if both queues' resource are not none. We should only 
check if _one of the queue's resource is not none_. This way to avoid skipping 
the queue resource comparison when we have one queue resource is not none and 
the other one is none. Specifically, change from AND condition to OR condition 
at the following codes:
   
   `
      if (!minEffRes1.equals(Resources.none()) || !minEffRes2.equals(
               Resources.none())) {
             return minEffRes2.compareTo(minEffRes1);
      }
   `
   
   **Testing**
   
   - Added a unit test 
`testPriorityQueueComparatorClassDoesNotViolateTimSortContract` to verify that 
sorting no longer throws `java.lang.IllegalArgumentException: Comparison method 
violates its general contract!`.
   
   - The test includes setting resource instance (0, 0) and resource(any 
number, any number) then shuffle the repeated queues that were created and then 
sort in using the `PriorityQueueComparator` class. It also mock the necessary 
elements, for example, `priority`  `label`, `absoluteUsedCapacity`, 
`usedCapacity` and `absoluteCapacity`. These elements can be of any number.
   
   **Impact**
   
   - Resolves the `TimSort` violation, ensuring stable and predictable sorting 
for `PriorityQueueComparator` class.
   - The `PriorityQueueComparator` sorting algorithm _may change_ in some 
behaviour when the queue resource is (0, 0).
   
   ### For code changes:
   
   - [X] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   
   




> YARN ResourceManager throws java.lang.IllegalArgumentExceptio: Comparison 
> method violates its general contract!
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-11745
>                 URL: https://issues.apache.org/jira/browse/YARN-11745
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn
>    Affects Versions: 3.4.0
>            Reporter: chhinlinghean
>            Assignee: chhinlinghean
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: ExampleZeroQueueResourceproblem.pdf
>
>
> The TimSort Transitivity rules got broken down when comparing both queues 
> with resources (0, 0), another queue with resources(some number, some number) 
> and with the same queues absolute capacity.
> *Steps reproduce with Unit test:*
> /hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/policy/TestPriorityUtilizationQueueOrderingPolicy.java
> {code:java}
> @Test
> public void testPriorityQueueComparatorClassDoesNotViolateTimSortContract() {
>   String partition = "testPartition";
>   
> List<PriorityUtilizationQueueOrderingPolicy.PriorityQueueResourcesForSorting> 
> queues = new ArrayList<>();
>   for (int i = 0; i < 300; i++) { // Have to be from 300 to make the test 
> deterministic
>     queues.add(createMockPriorityQueueResourcesForSorting(
>             partition, Resource.newInstance(0, 0)) // Need to be (0, 0)
>     );
>     queues.add(createMockPriorityQueueResourcesForSorting(
>             partition, Resource.newInstance(8, 20)) // Could be any number
>     );
>     queues.add(createMockPriorityQueueResourcesForSorting(
>             partition, Resource.newInstance(8, 8)) // Could be any number
>     );
>   }
>   Collections.shuffle(queues);
>   // java.lang.IllegalArgumentException: Comparison method violates its 
> general contract!
>   assertDoesNotThrow(() -> Collections.sort(queues, new 
> PriorityUtilizationQueueOrderingPolicy(true)
>           .new PriorityQueueComparator(partition)));
> }
> private 
> PriorityUtilizationQueueOrderingPolicy.PriorityQueueResourcesForSorting
>   createMockPriorityQueueResourcesForSorting(String partition, Resource 
> resource)
> {
>   QueueCapacities mockQueueCapacities = mock(QueueCapacities.class);
>   
> when(mockQueueCapacities.getAbsoluteUsedCapacity(partition)).thenReturn(3.2f);
>  // Could be any number
>   when(mockQueueCapacities.getUsedCapacity(partition)).thenReturn(1.0f); // 
> Could be any number
>   when(mockQueueCapacities.getAbsoluteCapacity(partition)).thenReturn(4.2f); 
> // Could be any number
>   CSQueue mockQueue = mock(CSQueue.class);
>   when(mockQueue.getQueueCapacities()).thenReturn(mockQueueCapacities);
>   when(mockQueue.getPriority()).thenReturn(Priority.newInstance(5)); // Could 
> be any number
>   
> when(mockQueue.getAccessibleNodeLabels()).thenReturn(Collections.singleton("label1"));
>  // simulated label
>   QueueResourceQuotas mockResourceQuotas = mock(QueueResourceQuotas.class);
>   
> when(mockResourceQuotas.getConfiguredMinResource(partition)).thenReturn(resource);
>   when(mockQueue.getQueueResourceQuotas()).thenReturn(mockResourceQuotas);
>   return new 
> PriorityUtilizationQueueOrderingPolicy.PriorityQueueResourcesForSorting(
>           mockQueue, partition
>   );{code}
> *How to fix it?*
> Instead of checking with an AND condition when both queues resources are not 
> none to compare its resources, we should check with an OR condition instead. 
> Because in the case one queue's resource is none another one is not we should 
> still compare by its resources.
> Previous code:
> {code:java}
> if (!minEffRes1.equals(Resources.none()) && 
> !minEffRes2.equals(Resources.none())) {
>   return minEffRes2.compareTo(minEffRes1);
> }
> float abs1 = q1Sort.absoluteCapacity;
> float abs2 = q2Sort.absoluteCapacity;
> return Float.compare(abs2, abs1); {code}
> Changed code to:
> {code:java}
> if (!minEffRes1.equals(Resources.none()) || 
> !minEffRes2.equals(Resources.none())) {
>   return minEffRes2.compareTo(minEffRes1);
> }
> float abs1 = q1Sort.absoluteCapacity;
> float abs2 = q2Sort.absoluteCapacity;
> return Float.compare(abs2, abs1); {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to