[
https://issues.apache.org/jira/browse/YARN-5139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274720#comment-16274720
]
Wangda Tan commented on YARN-5139:
----------------------------------
Thanks [~Tao Yang] for reporting this. I'm glad to see this has been running in
your prod environment for half a year!
Yeah, please share your use cases for multi-node lookup from your POV. We
can incorporate them once we start working on the implementation.
In terms of future work, besides bug fixes, the plans are:
1) Global Scheduler Related Refactorings:
(YARN-7438). Additional changes to make SchedulingPlacementSet agnostic to
ResourceRequest / placement algorithm.
(YARN-7457). Delay scheduling should be an individual policy instead of part of
scheduler implementation.
(YARN-7496). Add multi-node lookup support for better placement.
The main purpose is to make the per-app placement allocator algorithms more
self-contained. We're trying to close YARN-7438 soon, and [~sunilg] is working on
YARN-7496. If you have interest/bandwidth, you may take a crack at YARN-7457,
which is also crucial for a clean separation of the allocation algorithm.
2) Additional placement algorithms:
They're all located under the YARN-6592 branch. Currently, most API patches have
been committed, and we're trying to finish a simple intra-app affinity/anti-affinity
feature as the first use case of scheduling constraints in the short term.
[~asuresh]/[~kkaranasos] are working on more advanced allocation algorithms
which can aggregate requests from different apps/services and run an LP solver to
better place services with picky scheduling constraints.
Please let me know your suggestions, and you're welcome to participate in this
work if you have interest!
> [Umbrella] Move YARN scheduler towards global scheduler
> -------------------------------------------------------
>
> Key: YARN-5139
> URL: https://issues.apache.org/jira/browse/YARN-5139
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: Explanantions of Global Scheduling (YARN-5139)
> Implementation.pdf, YARN-5139-Concurrent-scheduling-performance-report.pdf,
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes-v2.pdf,
> YARN-5139-Global-Schedulingd-esign-and-implementation-notes.pdf,
> YARN-5139.000.patch, wip-1.YARN-5139.patch, wip-2.YARN-5139.patch,
> wip-3.YARN-5139.patch, wip-4.YARN-5139.patch, wip-5.YARN-5139.patch
>
>
> The existing YARN scheduler is based on node heartbeats. This can lead to
> sub-optimal decisions because the scheduler can only look at one node at a time
> when scheduling resources.
> Pseudo code of existing scheduling logic looks like:
> {code}
> for node in allNodes:
>     Go to parentQueue
>         Go to leafQueue
>             for application in leafQueue.applications:
>                 for resource-request in application.resource-requests:
>                     try to schedule on node
> {code}
> Considering future complex resource placement requirements, such as node
> constraints (give me "a && b || c") or anti-affinity (do not allocate HBase
> regionservers and Storm workers on the same host), we may need to consider
> moving the YARN scheduler towards global scheduling.
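To make the contrast in the description concrete, here is a minimal Python sketch of node-at-a-time scheduling versus a pass that looks at all nodes per request. It is only an illustration: `Node`, `Request`, `heartbeat_schedule`, `global_schedule`, the integer capacity units, and the tightest-fit rule are all hypothetical names and assumptions made for this example, not YARN classes or the algorithm implemented under YARN-5139.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free: int                                # free capacity, abstract units
    tags: set = field(default_factory=set)   # tags of containers already placed here


@dataclass
class Request:
    tag: str
    size: int
    avoid: frozenset = frozenset()           # anti-affinity: tags it must not share a host with


def fits(node, req):
    # A request fits if there is capacity and no anti-affinity conflict.
    return node.free >= req.size and not (node.tags & req.avoid)


def place(node, req):
    node.free -= req.size
    node.tags.add(req.tag)


def heartbeat_schedule(nodes, requests):
    """Node-at-a-time: each heartbeat offers one node to the pending requests."""
    pending, placements = list(requests), {}
    for node in nodes:                       # one heartbeat per node
        for req in list(pending):
            if fits(node, req):
                place(node, req)
                placements[req.tag] = node.name
                pending.remove(req)
    return placements, pending


def global_schedule(nodes, requests):
    """Request-at-a-time: each request sees every node and takes the tightest fit."""
    pending, placements = [], {}
    for req in requests:
        candidates = [n for n in nodes if fits(n, req)]
        if not candidates:
            pending.append(req)
            continue
        best = min(candidates, key=lambda n: n.free - req.size)
        place(best, req)
        placements[req.tag] = best.name
    return placements, pending


def make_cluster():
    return [Node("n1", 8), Node("n2", 2)]


def make_requests():
    return [Request("hbase", 2), Request("storm", 8, frozenset({"hbase"}))]


if __name__ == "__main__":
    print(heartbeat_schedule(make_cluster(), make_requests()))
    print(global_schedule(make_cluster(), make_requests()))
```

In this toy cluster the heartbeat loop places the small `hbase` request on the first node that reports in (`n1`), which leaves no node able to host `storm`. The multi-node pass puts `hbase` on `n2` and keeps `n1` free for `storm`, which also satisfies the anti-affinity constraint.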