[
https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183324#comment-15183324
]
Karthik Kambatla commented on YARN-4719:
----------------------------------------
bq. I may not understand about this, could you elaborate?
When we add or remove a node, a few things are updated:
# The map/set holding these nodes
# Total cluster capacity
# Total inflated cluster capacity (that takes allocated but not utilized
resources into account)
# Largest container that could potentially be allocated
As long as it doesn't interfere with performance, I would like to keep all of
this information consistent. That is if one thread has added a node, another
thread shouldn't see a stale value for cluster capacity. Ensuring this
consistency requires us to hold write locks when adding or removing a node, and
read locks when accessing any of this information. Given that the current code
has been ensuring consistency through a lock on the scheduler (at least
FairScheduler), I doubt adding these read/write locks would lead to performance
issues. I haven't done any benchmarking though.
Do you agree with the approach of consistency-over-performance to begin with?
If yes, we need the locks. And, making the HashMap a ConcurrentHashMap wouldn't
buy us much.
Coming to iterating over the nodes, I agree with your concern that we might
proliferate ClusterNodeTracker with methods like addBlacklistedNodeIdsToList.
And, understand your point that using a ConcurrentHashMap would allow us to
iterate freely on the snapshot data without exposing the internals. That said,
given that there are likely only a handful of cases where iterating through all
the nodes is necessary and one of the goals for this library to help with
preemption, node-labeling, and over-subscription, what do you think of the
following construct:
{code}
public List<NodeId> filter(NodeFilter nodeFilter) {
List<NodeId> nodeIds = new ArrayList<>();
readLock.lock();
try {
for (N node : nodes.values()) {
if (nodeFilter.accept(node)) {
nodeIds.add(node.getNodeID());
}
}
} finally {
readLock.unlock();
}
return nodeIds;
}
{code}
{{addBlackListedNodeIdsToList}} can implement a trivial {{NodeFilter.accept}}
as {{SchedulerAppUtils.isBlacklisted(app, node, LOG)}}. We have used this
approach to great effect with PathFilter.
Thoughts?
> Add a helper library to maintain node state and allows common queries
> ---------------------------------------------------------------------
>
> Key: YARN-4719
> URL: https://issues.apache.org/jira/browse/YARN-4719
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: scheduler
> Affects Versions: 2.8.0
> Reporter: Karthik Kambatla
> Assignee: Karthik Kambatla
> Attachments: yarn-4719-1.patch, yarn-4719-2.patch, yarn-4719-3.patch
>
>
> The scheduler could use a helper library to maintain node state and allowing
> matching/sorting queries. Several reasons for this:
> # Today, a lot of the node state management is done separately in each
> scheduler. Having a single library will take us that much closer to reducing
> duplication among schedulers.
> # Adding a filtering/matching API would simplify node labels and locality
> significantly.
> # An API that returns a sorted list for a custom comparator would help
> YARN-1011 where we want to sort by allocation and utilization for
> continuous/asynchronous and opportunistic scheduling respectively.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)