[jira] [Commented] (YARN-10380) Import logic of multi-node allocation in CapacityScheduler

zhuqi (Jira) Wed, 25 Nov 2020 22:09:04 -0800


    [ 
https://issues.apache.org/jira/browse/YARN-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17239076#comment-17239076
 ]


zhuqi commented on YARN-10380:
------------------------------

[~wangda] [~sunilg] [~BilwaST]

I have attached the draft patch. Please let me know if you have any advice.

Thanks.

 

> Import logic of multi-node allocation in CapacityScheduler
> ----------------------------------------------------------
>
>                 Key: YARN-10380
>                 URL: https://issues.apache.org/jira/browse/YARN-10380
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 3.3.0, 3.4.0
>            Reporter: Wangda Tan
>            Assignee: zhuqi
>            Priority: Critical
>         Attachments: YARN-10380.001.patch
>
>
> *1) Entry point:* 
> When we do multi-node allocation, we use the same logic as async 
> scheduling:
> {code:java}
> // Allocate containers of node [start, end)
> for (FiCaSchedulerNode node : nodes) {
>   if (current++ >= start) {
>     if (shouldSkipNodeSchedule(node, cs, printSkipedNodeLogging)) {
>       continue;
>     }
>     cs.allocateContainersToNode(node.getNodeID(), false);
>   }
> }
> {code}
> Is this the most effective way to do multi-node scheduling? Should we allocate 
> based on partitions? In the above logic, if we have thousands of nodes in one 
> partition, we will repeatedly access all nodes of the partition thousands of 
> times.
> I would suggest making the entry points for node-heartbeat, 
> async-scheduling (single node), and async-scheduling (multi-node) 
> different.
> Node-heartbeat and async-scheduling (single node) can still be similar and 
> share most of the code. 
> Async-scheduling (multi-node) should iterate over partitions first, using 
> pseudo code like: 
> {code:java}
> for (partition : all partitions) {
>   allocateContainersOnMultiNodes(getCandidate(partition))
> } {code}
>  
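
The cost argument in the description can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the CapacityScheduler implementation: `Node`, `groupByPartition`, `nodeFirstVisits`, and `partitionFirstVisits` are hypothetical names, and the model assumes that every per-node allocation attempt looks at all candidate nodes of that node's partition, as the description suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MultiNodeSketch {
    // Hypothetical stand-in for a scheduler node; not the real FiCaSchedulerNode.
    static final class Node {
        final String id;
        final String partition;

        Node(String id, String partition) {
            this.id = id;
            this.partition = partition;
        }
    }

    // Groups nodes by partition label, preserving first-seen partition order.
    static Map<String, List<Node>> groupByPartition(List<Node> nodes) {
        Map<String, List<Node>> byPartition = new LinkedHashMap<>();
        for (Node n : nodes) {
            byPartition.computeIfAbsent(n.partition, k -> new ArrayList<>()).add(n);
        }
        return byPartition;
    }

    // Node-first entry point (assumed model): one allocation attempt per node,
    // and each attempt considers every candidate node in that node's partition.
    static long nodeFirstVisits(List<Node> nodes) {
        Map<String, List<Node>> byPartition = groupByPartition(nodes);
        long visits = 0;
        for (Node n : nodes) {
            visits += byPartition.get(n.partition).size();
        }
        return visits;
    }

    // Partition-first entry point: one allocation attempt per partition,
    // which considers that partition's candidate nodes once.
    static long partitionFirstVisits(List<Node> nodes) {
        long visits = 0;
        for (List<Node> candidates : groupByPartition(nodes).values()) {
            visits += candidates.size();
        }
        return visits;
    }
}
```

In this model a partition with N nodes costs N * N node visits per scheduling pass when the loop is driven per node, and N visits when it is driven per partition, which is the gap the partition-first pseudo code aims to close.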



