[jira] [Commented] (YARN-11000) Replace queue resource calculation logic in updateClusterResource

ASF GitHub Bot (Jira) Tue, 30 May 2023 05:42:50 -0700


    [ 
https://issues.apache.org/jira/browse/YARN-11000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727505#comment-17727505
 ]


ASF GitHub Bot commented on YARN-11000:
---------------------------------------

brumi1024 commented on code in PR #3618:
URL: https://github.com/apache/hadoop/pull/3618#discussion_r1210167984


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/AbstractCSQueue.java:
##########
@@ -413,8 +413,8 @@ protected void setupQueueConfigs(Resource clusterResource) 
throws
    */
   protected void setDynamicQueueProperties() {
     // Set properties from parent template
-    if (parent instanceof ParentQueue && isDynamicQueue()) {
-      ((ParentQueue) parent).getAutoCreatedQueueTemplate()
+    if (parent instanceof AbstractParentQueue && isDynamicQueue()) {
+      ((AbstractParentQueue) parent).getAutoCreatedQueueTemplate()

Review Comment:
   Nit: For the sake of clarity this could be limited to ParentQueue, as an 
AbstractManagedParentQueue cannot be dynamic, only a Parent can. And 
AbstractParentQueue is the superclass of AbstractManagedParent and Parent.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestKillApplicationWithRMHA.java:
##########
@@ -62,6 +63,9 @@ public void testKillAppWhenFailoverHappensAtNewState()
         new MockNM("127.0.0.1:1234", 15120, rm1.getResourceTrackerService());
     nm1.registerNode();
 
+    CapacitySchedulerTestUtilities.setupCapacityScheduler(rm1, 15, 8);
+    CapacitySchedulerTestUtilities.setupCapacityScheduler(rm2, 15, 8);

Review Comment:
   15, 8 are reused multiple times, please consider them moving to a variable.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/TestSchedulingWithAllocationRequestId.java:
##########
@@ -74,6 +76,8 @@ public void testMultipleAllocationRequestIds() throws 
Exception {
 
       MockNM nm1 = rm.registerNode("127.0.0.1:1234", 4 * GB);
       MockNM nm2 = rm.registerNode("127.0.0.2:5678", 4 * GB);
+
+      CapacityScheduler cs = 
CapacitySchedulerTestUtilities.setupCapacityScheduler(rm, 8);

Review Comment:
   Using giga as a prefix in the memory parameter of setupCapacityScheduler is 
misleading I think, for the following reason:
   
   Most of the tests use either a GB variable as a multiplier, or they write 
the whole number. This way only seeing the number tells us what resource are we 
talking about. Here on the other hand if I glance at it I can't determine if 
we're initing CS with a default memory value and 8 vcores, or 8 KB/MB/GB of 
memory. I would prefer if the setupCapacityScheduler didn't do the 
multiplication internally, and a value like this would be used: 8 * GB.
   



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerTestUtilities.java:
##########
@@ -237,4 +243,67 @@ public static void checkNodeResourceUsage(int expected, 
NodeManager node) {
     Assert.assertEquals(expected, node.getUsed().getMemorySize());
     node.checkResourceUsage();
   }
+
+  public static CapacityScheduler setupCapacityScheduler(MockRM rm, int 
memoryGb) {
+    return setupCapacityScheduler(rm, memoryGb, 16);
+  }
+
+  public static CapacityScheduler setupCapacityScheduler(MockRM rm, int 
memoryGb, int cores) {
+    return setupCapacityScheduler(
+        rm,
+        Resource.newInstance(memoryGb * GB, cores),
+        Collections.emptyMap()
+    );
+  }
+
+  public static CapacityScheduler setupCapacityScheduler(MockRM rm, Resource 
resource) {
+    return setupCapacityScheduler(rm, resource, Collections.emptyMap());
+  }
+
+  public static CapacityScheduler setupCapacityScheduler(
+      MockRM rm, Resource resource, Map<String, String> nameToValues
+  ) {
+    if(!(rm.getResourceScheduler() instanceof CapacityScheduler)) {
+      return null;
+    }
+    NullRMNodeLabelsManager mgr = (NullRMNodeLabelsManager) 
rm.getRMContext().getNodeLabelManager();
+    CapacitySchedulerQueueCapacityHandler queueController =
+        new CapacitySchedulerQueueCapacityHandler(mgr);
+    CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
+    Map<String, Long> others = new HashMap<>(nameToValues.size());
+    for (Map.Entry<String, String> nameToValue : nameToValues.entrySet()) {
+      others.put(nameToValue.getKey(), Long.valueOf(nameToValue.getValue()));
+    }
+    Resource clusterResource = Resource.newInstance(
+        resource.getMemorySize(), resource.getVirtualCores(), others);
+    mgr.setResourceForLabel(CommonNodeLabelsManager.NO_LABEL, clusterResource);
+    queueController.updateRoot(cs.getQueue("root"), clusterResource);
+    updateChildren(queueController, clusterResource, cs.getQueue("root"));
+    return cs;
+  }
+
+  public static void updateRootQueue(

Review Comment:
   From testing point of view I think a simple updateCSQueues or something 
similar would be more self-explanatory. That way updateRootQueue and 
updateChildren could be merged into one method, as the logic is quite simple: 
start from the root, and call the necessary update methods on all queues 
recursively.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerTestUtilities.java:
##########
@@ -68,11 +73,12 @@ public final class CapacitySchedulerTestUtilities {
   private CapacitySchedulerTestUtilities() {
   }
 
-  public static void setQueueHandler(CapacitySchedulerContext cs) {
+  public static CapacitySchedulerQueueManager 
setQueueHandler(CapacitySchedulerContext cs) {

Review Comment:
   Nit: I'm not sure the name describes the method anymore. 
CreateMockQueueHandler?



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerAmbiguousLeafs.java:
##########
@@ -67,13 +68,10 @@ private ApplicationId submitApplication(MockRM rm, String 
queue)
 
   @Test
   public void testAmbiguousSubmissionWithACL() throws Exception {
-    YarnConfiguration conf = new YarnConfiguration();
-    conf.set(YarnConfiguration.RM_SCHEDULER, 
CapacityScheduler.class.getName());
-    conf.setBoolean(YarnConfiguration.YARN_ACL_ENABLE, true);
+    CapacitySchedulerConfiguration schedulerConf = new 
CapacitySchedulerConfiguration();
 
-    MockRM rm = new MockRM(conf);
-    CapacityScheduler cs = (CapacityScheduler)rm.getResourceScheduler();
-    CapacitySchedulerConfiguration schedulerConf = cs.getConfiguration();
+    schedulerConf.set(YarnConfiguration.RM_SCHEDULER, 
CapacityScheduler.class.getName());
+    schedulerConf.setBoolean(YarnConfiguration.YARN_ACL_ENABLE, true);

Review Comment:
   Nit: I know the new code is technically similar (because CSConf inherits 
YarnConf's properties), but RM_SCHEDULER and YARN_ACL_ENABLE are 
YarnConfiguration entries. For that reason I think the previous implementation 
- while more verbose - looks better. The reason for the change is probably the 
relocated setupCapacityScheduler call, but schedulerConf could be created with 
the `public CapacitySchedulerConfiguration(Configuration configuration)` 
constructor from YarnConfiguration.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacitySchedulerNewQueueAutoCreation.java:
##########
@@ -1264,7 +1272,9 @@ public void testAutoCreateInvalidParent() throws 
Exception {
 
   protected AbstractLeafQueue createQueue(String queuePath) throws 
YarnException,
       IOException {
-    return autoQueueHandler.createQueue(new QueuePath(queuePath));
+    AbstractLeafQueue result =  autoQueueHandler.createQueue(new 
QueuePath(queuePath));
+    cs = CapacitySchedulerTestUtilities.setupCapacityScheduler(mockRM, 1200);

Review Comment:
   cs is not used.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimitsByPartition.java:
##########
@@ -737,7 +744,7 @@ public RMNodeLabelsManager createNodeLabelManager() {
      *  AM resource percent config for queue B1 -> 0.15
      *        ==> 1Gb is max-am-resource-limit
      *
-     *  Only one app will be activated and all othe will be pending.
+     *  Only one app will be activated and all other will be pending.

Review Comment:
   Nit: other**s**



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestAbsoluteResourceConfiguration.java:
##########
@@ -253,7 +253,7 @@ public void testSimpleMinMaxResourceConfigurartionPerQueue()
     rm.registerNode("127.0.0.1:1234", 250 * GB, 40);
 
     // Get queue object to verify min/max resource configuration.
-    CapacityScheduler cs = (CapacityScheduler) rm.getResourceScheduler();
+    CapacityScheduler cs = 
CapacitySchedulerTestUtilities.setupCapacityScheduler(rm, 250, 40);

Review Comment:
   SetupCapacityScheduler is reused for getting a CS object and/or initing it. 
These could be separated to an init method and a factory method, so that there 
are no calls to the setup which doesn't use the return value. Or maybe the 
factory method could return a "MockCS" which inherits most of the production 
CS's methods, but calls the necessary init method after a `reinitialize` call. 
This way there is no need to call setupCapacityScheduler over and over.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/RMHATestBase.java:
##########
@@ -218,5 +236,8 @@ protected void startRMs(MockRM rm1, Configuration 
confForRM1, MockRM rm2,
     rm1.adminService.transitionToActive(requestInfo);
     Assert.assertTrue(rm1.getRMContext().getHAServiceState()
         == HAServiceState.ACTIVE);
+    CapacitySchedulerTestUtilities.setupCapacityScheduler(rm1, 8, 8);
+    CapacitySchedulerTestUtilities.setupCapacityScheduler(rm1, 8, 8);

Review Comment:
   Why are there 2 setup calls for the same RM?



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestApplicationLimits.java:
##########
@@ -851,7 +868,8 @@ public RMNodeLabelsManager createNodeLabelManager() {
     assertEquals(RMAppState.FAILED, app3.getState());
     assertEquals(
         "org.apache.hadoop.security.AccessControlException: "
-            + "Queue root.a.a1 already has 1 applications, cannot accept "
+            + "Queue root.a.a1 already has 1 applications, "
+            + "what is more than the max 1, so cannot accept "

Review Comment:
   This'll fail, I think this was left here accidentally.



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/webapp/scheduler-response-PerUserResources.xml:
##########
@@ -16,7 +16,7 @@
         <usedCapacity>0.0</usedCapacity>
         <maxCapacity>100.0</maxCapacity>
         <absoluteCapacity>0.0</absoluteCapacity>
-        <absoluteMaxCapacity>0.0</absoluteMaxCapacity>
+        <absoluteMaxCapacity>100.0</absoluteMaxCapacity>

Review Comment:
   There are quite a lot changes in the asserts. What's the reason behind them?



##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestChildQueueOrder.java:
##########
@@ -295,38 +297,26 @@ public void testSortedQueues() throws Exception {
 
     // Assign {1,2,3,4} 1GB containers respectively to queues
     stubQueueAllocation(a, clusterResource, node_0, 1*GB);
-    stubQueueAllocation(b, clusterResource, node_0, 0*GB);
-    stubQueueAllocation(c, clusterResource, node_0, 0*GB);
-    stubQueueAllocation(d, clusterResource, node_0, 0*GB);

Review Comment:
   Why were these removed?





> Replace queue resource calculation logic in updateClusterResource
> -----------------------------------------------------------------
>
>                 Key: YARN-11000
>                 URL: https://issues.apache.org/jira/browse/YARN-11000
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>            Reporter: Andras Gyori
>            Assignee: Andras Gyori
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> YARN-10965 introduces a brand new queue calculation system. In order to 
> simplify the review process, this issue replaces the current logic with the 
> newly introduced one.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (YARN-11000) Replace queue resource calculation logic in updateClusterResource

Reply via email to