[
https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217294#comment-15217294
]
Rohith Sharma K S commented on YARN-4002:
-----------------------------------------
Hi [~zhiguohong]
In the latest patch, YARN-4002-rwlock-v2.patch, the return statement of
isValidNode is outside the read lock. It should be inside the read lock.
{code}
return (hostsList.isEmpty() || hostsList.contains(hostName)
    || hostsList.contains(ip))
    && !(excludeList.contains(hostName) || excludeList.contains(ip));
{code}
The earlier patch, 0001-YARN-4002.patch, looks fine to me.
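To make the point concrete, here is a minimal sketch of what "inside the read lock" means. It is not the patch itself: the class name NodeListSketch, the refresh method and the field layout are assumptions made for illustration. Only the boolean expression is taken from the snippet above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the node list check; not the actual YARN class.
class NodeListSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private Set<String> hostsList = new HashSet<>();
    private Set<String> excludeList = new HashSet<>();

    // Writer side: assumed to be called only on "refresh nodes".
    void refresh(Set<String> includes, Set<String> excludes) {
        lock.writeLock().lock();
        try {
            hostsList = new HashSet<>(includes);
            excludeList = new HashSet<>(excludes);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Reader side: the sets are read while the read lock is still held,
    // because the return expression sits inside the try block.
    boolean isValidNode(String hostName, String ip) {
        lock.readLock().lock();
        try {
            return (hostsList.isEmpty() || hostsList.contains(hostName)
                    || hostsList.contains(ip))
                    && !(excludeList.contains(hostName) || excludeList.contains(ip));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

If the return statement is moved after the finally block, the contains() calls run without the lock and can race with a concurrent refresh.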
> make ResourceTrackerService.nodeHeartbeat more concurrent
> ---------------------------------------------------------
>
> Key: YARN-4002
> URL: https://issues.apache.org/jira/browse/YARN-4002
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Hong Zhiguo
> Assignee: Hong Zhiguo
> Priority: Critical
> Attachments: 0001-YARN-4002.patch, YARN-4002-lockless-read.patch,
> YARN-4002-rwlock-v2.patch, YARN-4002-rwlock.patch, YARN-4002-v0.patch
>
>
> We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By
> design the method ResourceTrackerService.nodeHeartbeat should be concurrent
> enough to scale for large clusters.
> But we have a "BIG" lock in NodesListManager.isValidNode, which I think is
> unnecessary.
> First, the fields "includes" and "excludes" of HostsFileReader are only
> updated on "refresh nodes". All RPC threads handling node heartbeats are
> only readers. So an RWLock could be used to allow concurrent access by RPC
> threads.
> Second, since the fields "includes" and "excludes" of HostsFileReader are
> always updated by "reference assignment", which is atomic in Java, the reader
> side lock could just be skipped.