[ https://issues.apache.org/jira/browse/YARN-9730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938302#comment-16938302 ]
Bibin Chundatt edited comment on YARN-9730 at 9/26/19 5:58 AM:
---------------------------------------------------------------

[~jhung] Thank you for working on this. Sorry to come in so late.

{quote}
if (ResourceRequest.ANY.equals(req.getResourceName())) {
  SchedulerUtils.enforcePartitionExclusivity(req,
      getRmContext().getExclusiveEnforcedPartitions(),
      asc.getNodeLabelExpression());
}
{quote}

A configuration lookup in the AM allocation flow is costly, as I observed while evaluating performance. Could you optimize {{getRmContext().getExclusiveEnforcedPartitions()}}, since it is invoked for every *request*?

> Support forcing configured partitions to be exclusive based on app node label
> -----------------------------------------------------------------------------
>
>                 Key: YARN-9730
>                 URL: https://issues.apache.org/jira/browse/YARN-9730
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Jonathan Hung
>            Assignee: Jonathan Hung
>            Priority: Major
>              Labels: release-blocker
>             Fix For: 2.10.0, 3.3.0, 3.2.2, 3.1.4
>
>         Attachments: YARN-9730-branch-2.001.patch, YARN-9730.001.addendum,
> YARN-9730.001.patch, YARN-9730.002.addendum, YARN-9730.002.patch,
> YARN-9730.003.patch
>
>
> Use case: queue X has all of its workload in non-default (exclusive)
> partition P (by setting the app submission context's node label to P). A node
> in partition Q != P heartbeats to the RM. The capacity scheduler loops through
> every application in X, and every scheduler key in each application, and fails
> to allocate each time since the app's requested label and the node's label
> don't match. This causes huge performance degradation when the number of apps
> in X is large.
> To fix the issue, allow the RM to configure partitions as "forced-exclusive".
> If partition P is "forced-exclusive", then:
> * 1a. If an app sets its submission context's node label to P, all its
> resource requests will be overridden to P
> * 1b. If an app sets its submission context's node label to Q, any of its
> resource requests whose labels are P will be overridden to Q
> * 2. In the scheduler, we add apps with node label expression P to a
> separate data structure. When a node in partition P heartbeats to the
> scheduler, we only try to schedule apps in this data structure. When a node in
> partition Q heartbeats to the scheduler, we schedule the rest of the apps as
> normal.
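
For illustration, override rules 1a and 1b from the description can be written as a small standalone Java sketch. This is not the code from the attached patches: the class name {{PartitionOverrideSketch}} and the method {{effectiveLabel}} are hypothetical, and labels are assumed to be non-null strings.

```java
import java.util.Set;

/**
 * Illustrative sketch only; not taken from the YARN-9730 patches.
 * All names in this class are hypothetical.
 */
class PartitionOverrideSketch {

  /**
   * Returns the node label a resource request should end up with.
   *
   * @param forcedExclusive partitions configured as "forced-exclusive"
   * @param appLabel        node label from the app submission context
   * @param requestLabel    node label set on the individual resource request
   */
  static String effectiveLabel(Set<String> forcedExclusive,
      String appLabel, String requestLabel) {
    // No forced-exclusive partitions configured: leave the request untouched.
    if (forcedExclusive == null || forcedExclusive.isEmpty()) {
      return requestLabel;
    }
    // Rule 1a: the app's label is forced-exclusive, so every request
    //          is overridden to the app's label.
    // Rule 1b: the request targets a forced-exclusive partition, so it
    //          is overridden to the app's label.
    if (forcedExclusive.contains(appLabel)
        || forcedExclusive.contains(requestLabel)) {
      return appLabel;
    }
    // Neither label is forced-exclusive: keep the request's own label.
    return requestLabel;
  }

  public static void main(String[] args) {
    Set<String> forced = Set.of("P");
    System.out.println(effectiveLabel(forced, "P", "Q")); // rule 1a: P
    System.out.println(effectiveLabel(forced, "Q", "P")); // rule 1b: Q
    System.out.println(effectiveLabel(forced, "Q", "R")); // unchanged: R
  }
}
```

Because the result depends only on the set of forced-exclusive partitions and the two labels, that set can be read once and passed in, rather than looked up for each request.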