[
https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346992#comment-16346992
]
Sunil G commented on YARN-7494:
-------------------------------
Thanks [~leftnoteasy] [~cheersyang] [~Tao Yang] for the comments.
Overall, I'll summarize the suggestions below, along with my own thoughts.
# *multi-node-lookup* is currently enabled at the cluster level; we could also
enable it at the application level and other subsequent levels. I will use
scheduling envs for this. I think we should not add a new element to
RegisterApplicationMasterRequest, as that would require changes from
applications. Instead we can use the scheduling env added in the other patch.
Once we have the type from the app (APP, QUEUE, or CLUSTER), we will pass it to
a factory to get the correct child class of {{CandidateNodeSet}}
(Simple/Partition based).
** Expose a node lookup SCOPE option from the app in the scheduling env, as
[SCOPE:APP/QUEUE/CLUSTER].
** SCOPE=APP enables the multi-node-lookup-policy which will be explained in
section 2. SCOPE=QUEUE fetches the default multi-node-placement-enabled config
of each queue. SCOPE=CLUSTER uses the value of
yarn.capacity.scheduler.multi-node-placement-enabled.
** SCOPE only enables the option to look up multiple nodes. For example, given
SCOPE=QUEUE while multi-node-lookup is disabled at the queue level, we will
still look at one node at a time.
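To make the SCOPE idea concrete, here is a minimal sketch of resolving the
proposed SCOPE entry from the scheduling env and falling back to CLUSTER when
the app supplies nothing. The names ({{ScopeResolver}}, {{parseScope}}, the
literal {{SCOPE}} key) are illustrative assumptions, not the actual patch:

```java
import java.util.Map;

// Illustrative sketch only: resolve the proposed SCOPE scheduling-env key.
public class ScopeResolver {
    public enum MultiNodeScope { APP, QUEUE, CLUSTER }

    // Default to CLUSTER when the app sets no SCOPE entry or the value
    // is malformed, so behavior degrades to the cluster-level config.
    public static MultiNodeScope parseScope(Map<String, String> schedulingEnv) {
        String raw = schedulingEnv.get("SCOPE");
        if (raw == null) {
            return MultiNodeScope.CLUSTER;
        }
        try {
            return MultiNodeScope.valueOf(raw.trim().toUpperCase());
        } catch (IllegalArgumentException e) {
            return MultiNodeScope.CLUSTER; // ignore unrecognized values
        }
    }
}
```

The resolved scope would then be handed to the factory that picks the
CandidateNodeSet child class.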
# {{yarn.capacity.sorting-nodes.policy.class}} at the cluster/queue/app level
gives the flexibility to choose the correct node lookup policy, provided
multi-node-placement-enabled is set at that level. So, as [~cheersyang]
mentioned, an app can override the queue-level policy.
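A rough sketch of the app-over-queue-over-cluster override order described
above (the helper and its parameter names are hypothetical, not the real
config-resolution code):

```java
// Illustrative sketch: resolve the effective sorting-nodes policy class,
// with app-level config overriding queue-level, and queue-level overriding
// the cluster-level default.
public class PolicyResolver {
    public static String resolvePolicy(String appPolicy, String queuePolicy,
            String clusterPolicy) {
        if (appPolicy != null && !appPolicy.isEmpty()) {
            return appPolicy;      // app-level override wins
        }
        if (queuePolicy != null && !queuePolicy.isEmpty()) {
            return queuePolicy;    // then the queue default
        }
        return clusterPolicy;      // finally the cluster-wide setting
    }
}
```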
# Given we have the abstraction to select a {{MultiNodePolicy}}, sorting
optimization could be done in a central manager. I initially thought about this
to avoid computation cost, but I had some concerns.
** Each time a node is added/removed or a capacity change happens, we need to
refresh the node set. A timer with periodic refreshes is not desirable, since
stale data in such a critical data structure is bad design.
** The number of nodes in a cluster keeps growing, so we may end up with a
duplicated copy per policy (given app-level policies).
# *[Proposal for #3]* Hence we can consider an interim layer. We already have
{{ClusterNodeTracker}} and the {{NodeFilter}} interface, so we can query this
manager with any kind of filter we need.
## Each MultiNodePolicy (NodeUsageBasedPolicy, running-container based, etc.)
will hold a reference to the original nodes retrieved from
{{ClusterNodeTracker#getNodes(NodeFilter)}}. A {{Map<MultiNodePolicy,
Set<SchedulerNode>>}} will be the master cache. This cache will be invalidated
on every node change event.
## Since we have a master cache, each app's MultiNodePolicy will simply fetch
the reference from the master map (e.g. NodeUsageBasedPolicy will have its
entry of nodes sorted in that order).
## Invalidating the cache is tricky. I'll improve ClusterNodeTracker to
register a callback that invalidates the master cache.
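The master-cache idea above could look roughly like this sketch (class and
method names are hypothetical, and plain node-ID strings stand in for
SchedulerNode): each policy gets one lazily computed, sorted view of the node
set, and the callback registered with the node tracker drops the cache on any
node change so the next lookup recomputes:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed master cache: one sorted node view
// per policy, shared by all apps using that policy.
public class MultiNodeSortingCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    // Lazily compute the sorted view for a policy; subsequent callers of
    // the same policy get the cached reference instead of re-sorting.
    public List<String> getSortedNodes(String policyName,
            Collection<String> clusterNodes,
            Comparator<String> policyOrder) {
        return cache.computeIfAbsent(policyName, k -> {
            List<String> sorted = new ArrayList<>(clusterNodes);
            sorted.sort(policyOrder);
            return sorted;
        });
    }

    // Callback registered with the node tracker: drop all cached orderings
    // whenever a node is added/removed or its capacity changes.
    public void invalidate() {
        cache.clear();
    }
}
```

After {{invalidate()}} runs, the next {{getSortedNodes}} call rebuilds the
entry from the current node set, which avoids both the periodic-timer
staleness and the per-app duplication concerns raised in #3.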
[~cheersyang] [~leftnoteasy] [~Tao Yang], please check this and share your
thoughts. Once we have consensus, I'll update my patch. Or, if a call is
needed, we can quickly plan that as well.
> Add multi node lookup support for better placement
> --------------------------------------------------
>
> Key: YARN-7494
> URL: https://issues.apache.org/jira/browse/YARN-7494
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: capacity scheduler
> Reporter: Sunil G
> Assignee: Sunil G
> Priority: Major
> Attachments: YARN-7494.001.patch, YARN-7494.v0.patch,
> YARN-7494.v1.patch
>
>
> Instead of single node, for effectiveness we can consider a multi node lookup
> based on partition to start with.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)