[jira] [Comment Edited] (YARN-10589) Improve logic of multi-node allocation

zhuqi (Jira) Mon, 01 Feb 2021 06:37:12 -0800


    [ 
https://issues.apache.org/jira/browse/YARN-10589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276369#comment-17276369
 ]


zhuqi edited comment on YARN-10589 at 2/1/21, 2:36 PM:
-------------------------------------------------------

[~ztang] [~tanu.ajmera]

I have reviewed the patch, i don't think the logic is needed, and it break the 
PRIORITY_SKIPPED logic.

In Schedule in priority order:
{code:java}
// Schedule in priority order
for (SchedulerRequestKey schedulerKey : application.getSchedulerKeys()) {
  ContainerAllocation result = allocate(clusterResource, candidates,
      schedulingMode, resourceLimits, schedulerKey, null);

  AllocationState allocationState = result.getAllocationState();
  if (allocationState == AllocationState.PRIORITY_SKIPPED) {
    continue;
  }
  return getCSAssignmentFromAllocateResult(clusterResource, result,
      null, node);
}


// will skip priority not meet
if (reservedContainer == null) {
  result = preCheckForNodeCandidateSet(clusterResource, node,
      schedulingMode, resourceLimits, schedulerKey);
  if (null != result) {
    continue;
  }
}
{code}
I think the original logic is right, in preCheckForNodeCandidateSet:
{code:java}
// Is the nodePartition of pending request matches the node's partition
// If not match, jump to next priority.
Optional<DiagnosticsCollector> dcOpt = activitiesManager == null ?
    Optional.empty() :
    activitiesManager.getOptionalDiagnosticsCollector();
if (!appInfo.precheckNode(schedulerKey, node, schedulingMode, dcOpt)) {
  ActivitiesLogger.APP.recordSkippedAppActivityWithoutAllocation(
      activitiesManager, node, application, schedulerKey,
      ActivityDiagnosticConstant.
          NODE_DO_NOT_MATCH_PARTITION_OR_PLACEMENT_CONSTRAINTS
          + ActivitiesManager.getDiagnostics(dcOpt),
      ActivityLevel.NODE);
  return ContainerAllocation.PRIORITY_SKIPPED;
}
{code}
We have skipped the priority, and the next priority may need this node 
partition. We just skipped the priority of application. 

If this partition not meet this priority, why we can think this partition not 
meet all priorities of the applications.

Any other thoughts?

 

 


was (Author: zhuqi):
[~ztang] [~tanu.ajmera]

I have reviewed the patch, i don't think the logic is needed, and it break the 
PRIORITY_SKIPPED logic.

In Schedule in priority order:
{code:java}
// Schedule in priority order
for (SchedulerRequestKey schedulerKey : application.getSchedulerKeys()) {
  ContainerAllocation result = allocate(clusterResource, candidates,
      schedulingMode, resourceLimits, schedulerKey, null);

  AllocationState allocationState = result.getAllocationState();
  if (allocationState == AllocationState.PRIORITY_SKIPPED) {
    continue;
  }
  return getCSAssignmentFromAllocateResult(clusterResource, result,
      null, node);
}

if (reservedContainer == null) {
  result = preCheckForNodeCandidateSet(clusterResource, node,
      schedulingMode, resourceLimits, schedulerKey);
  if (null != result) {
    continue;
  }
}
{code}
I think the original logic is right, in preCheckForNodeCandidateSet:
{code:java}
// Is the nodePartition of pending request matches the node's partition
// If not match, jump to next priority.
Optional<DiagnosticsCollector> dcOpt = activitiesManager == null ?
    Optional.empty() :
    activitiesManager.getOptionalDiagnosticsCollector();
if (!appInfo.precheckNode(schedulerKey, node, schedulingMode, dcOpt)) {
  ActivitiesLogger.APP.recordSkippedAppActivityWithoutAllocation(
      activitiesManager, node, application, schedulerKey,
      ActivityDiagnosticConstant.
          NODE_DO_NOT_MATCH_PARTITION_OR_PLACEMENT_CONSTRAINTS
          + ActivitiesManager.getDiagnostics(dcOpt),
      ActivityLevel.NODE);
  return ContainerAllocation.PRIORITY_SKIPPED;
}
{code}
We have skipped the priority, and the next priority may need this node 
partition. We just skipped the priority of application. 

If this partition not meet this priority, why we can think this partition not 
meet all priorities of the applications.

Any other thoughts?

 

 

> Improve logic of multi-node allocation
> --------------------------------------
>
>                 Key: YARN-10589
>                 URL: https://issues.apache.org/jira/browse/YARN-10589
>             Project: Hadoop YARN
>          Issue Type: Task
>    Affects Versions: 3.3.0
>            Reporter: Tanu Ajmera
>            Assignee: Tanu Ajmera
>            Priority: Major
>         Attachments: YARN-10589-001.patch
>
>
> {code:java}
> for (String partititon : partitions) {
>  if (current++ > start) {
>  break;
>  }
>  CandidateNodeSet<FiCaSchedulerNode> candidates =
>  cs.getCandidateNodeSet(partititon);
>  if (candidates == null) {
>  continue;
>  }
>  cs.allocateContainersToNode(candidates, false);
> }{code}
> In above logic, if we have thousands of node in one partition, we will still 
> repeatedly access all nodes of the partition thousands of times. There is no 
> break point where if the partition is not same for the first node, it should 
> stop checking other nodes in that partition.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Comment Edited] (YARN-10589) Improve logic of multi-node allocation

Reply via email to