[
https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15182591#comment-15182591
]
Wangda Tan commented on YARN-4719:
----------------------------------
Hi [~kasha],
bq. through addNode and removeNode so total_cluster_resources,
total_inflated_cluster_resources (for YARN-1011), max_cluster_resources are not
affected by other scheduler code.
I'm not sure I understand this; could you elaborate?
To let scheduler code iterate over nodes, we could do one of the following:
# Use a concurrent map to avoid locking, so iterating code will not break.
Drawback: we need to handle stale data.
# Expose the lock to external callers, so the scheduler can take the read lock
of ClusterNodeTracker and iterate. Drawback: iterating over nodes and allocating
containers could hold the ClusterNodeTracker lock for a long time.
# Assume the scheduler's synchronized lock is held whenever changes are made to
ClusterNodeTracker (addNode, removeNode, etc.) and also while iterating over
nodes. We then don't need an extra lock on the returned node collections.
Drawback: this hides the locking requirement from external callers, and in the
future the scheduler could remove the synchronized lock to get better
performance.
I would suggest checking whether #1 is doable (handle stale data and assume
eventual consistency). #1 should have the best performance and be the most
flexible with respect to future scheduler changes.
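To make #1 concrete, here is a minimal sketch of what I have in mind. It is not
taken from any of the attached patches: NodeSketch and ClusterNodeTrackerSketch
are hypothetical stand-ins (not the real SchedulerNode or ClusterNodeTracker),
and memory is the only resource modeled. The idea is that readers take a weakly
consistent snapshot without locking and re-validate entries before acting on
them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a scheduler node; not the real SchedulerNode.
final class NodeSketch {
  final String nodeId;
  final int availableMemoryMb;

  NodeSketch(String nodeId, int availableMemoryMb) {
    this.nodeId = nodeId;
    this.availableMemoryMb = availableMemoryMb;
  }
}

// Hypothetical tracker illustrating option #1: no lock is exposed to callers.
final class ClusterNodeTrackerSketch {
  private final ConcurrentMap<String, NodeSketch> nodes =
      new ConcurrentHashMap<>();

  void addNode(NodeSketch node) {
    nodes.put(node.nodeId, node);
  }

  NodeSketch removeNode(String nodeId) {
    return nodes.remove(nodeId);
  }

  boolean exists(String nodeId) {
    return nodes.containsKey(nodeId);
  }

  // Copies the current values without locking. ConcurrentHashMap views are
  // weakly consistent, so this never throws ConcurrentModificationException,
  // but the copy may already be stale by the time the caller uses it.
  List<NodeSketch> snapshot() {
    return new ArrayList<>(nodes.values());
  }

  // Stale-data handling: keep only nodes that are still registered as the
  // same object. A node removed (or replaced) after the snapshot is dropped.
  List<NodeSketch> dropStale(List<NodeSketch> snapshot) {
    List<NodeSketch> live = new ArrayList<>();
    for (NodeSketch node : snapshot) {
      if (nodes.get(node.nodeId) == node) {
        live.add(node);
      }
    }
    return live;
  }
}
```

A scheduler would iterate over snapshot() and call something like dropStale()
(or check exists()) right before allocating on a node. The re-check narrows the
window for acting on a removed node but does not close it, which is the
"eventual consistency" assumption mentioned above.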
> Add a helper library to maintain node state and allows common queries
> ---------------------------------------------------------------------
>
> Key: YARN-4719
> URL: https://issues.apache.org/jira/browse/YARN-4719
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: scheduler
> Affects Versions: 2.8.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Attachments: yarn-4719-1.patch, yarn-4719-2.patch, yarn-4719-3.patch
>
>
> The scheduler could use a helper library to maintain node state and allow
> matching/sorting queries. Several reasons for this:
> # Today, a lot of the node state management is done separately in each
> scheduler. Having a single library will take us that much closer to reducing
> duplication among schedulers.
> # Adding a filtering/matching API would simplify node labels and locality
> significantly.
> # An API that returns a sorted list for a custom comparator would help
> YARN-1011 where we want to sort by allocation and utilization for
> continuous/asynchronous and opportunistic scheduling respectively.