[ 
https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183324#comment-15183324
 ] 

Karthik Kambatla commented on YARN-4719:
----------------------------------------

bq. I may not understand about this, could you elaborate?
When we add or remove a node, a few things are updated:
# The map/set holding these nodes
# Total cluster capacity
# Total inflated cluster capacity (that takes allocated but not utilized 
resources into account)
# Largest container that could potentially be allocated

As long as it doesn't interfere with performance, I would like to keep all of 
this information consistent. That is if one thread has added a node, another 
thread shouldn't see a stale value for cluster capacity. Ensuring this 
consistency requires us to hold write locks when adding or removing a node, and 
read locks when accessing any of this information. Given that the current code 
has been ensuring consistency through a lock on the scheduler (at least 
FairScheduler), I doubt adding these read/write locks would lead to performance 
issues. I haven't done any benchmarking though. 

Do you agree with the approach of consistency-over-performance to begin with? 
If yes, we need the locks. And, making the HashMap a ConcurrentHashMap wouldn't 
buy us much. 

Coming to iterating over the nodes, I agree with your concern that we might 
proliferate ClusterNodeTracker with methods like addBlacklistedNodeIdsToList. 
And, understand your point that using a ConcurrentHashMap would allow us to 
iterate freely on the snapshot data without exposing the internals. That said, 
given that there are likely only a handful of cases where iterating through all 
the nodes is necessary and one of the goals for this library to help with 
preemption, node-labeling, and over-subscription, what do you think of the 
following construct:
{code}
  public List<NodeId> filter(NodeFilter nodeFilter) {
    List<NodeId> nodeIds = new ArrayList<>();
    readLock.lock();
    try {
      for (N node : nodes.values()) {
        if (nodeFilter.accept(node)) {
          nodeIds.add(node.getNodeID());
        }
      }
    } finally {
      readLock.unlock();
    }
    return nodeIds;
  }
{code}

{{addBlackListedNodeIdsToList}} can implement a trivial {{NodeFilter.accept}} 
as {{SchedulerAppUtils.isBlacklisted(app, node, LOG)}}. We have used this 
approach to great effect with PathFilter. 

Thoughts? 

> Add a helper library to maintain node state and allows common queries
> ---------------------------------------------------------------------
>
>                 Key: YARN-4719
>                 URL: https://issues.apache.org/jira/browse/YARN-4719
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: scheduler
>    Affects Versions: 2.8.0
>            Reporter: Karthik Kambatla
>            Assignee: Karthik Kambatla
>         Attachments: yarn-4719-1.patch, yarn-4719-2.patch, yarn-4719-3.patch
>
>
> The scheduler could use a helper library to maintain node state and allowing 
> matching/sorting queries. Several reasons for this:
> # Today, a lot of the node state management is done separately in each 
> scheduler. Having a single library will take us that much closer to reducing 
> duplication among schedulers.
> # Adding a filtering/matching API would simplify node labels and locality 
> significantly. 
> # An API that returns a sorted list for a custom comparator would help 
> YARN-1011 where we want to sort by allocation and utilization for 
> continuous/asynchronous and opportunistic scheduling respectively. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to