[ https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662100#comment-14662100 ]
Wangda Tan commented on YARN-4002:
----------------------------------
Thanks for reporting, [~zhiguohong]. +1 for this proposal; we need to fix this
coarse-grained synchronized lock.
> make ResourceTrackerService.nodeHeartbeat more concurrent
> ---------------------------------------------------------
>
> Key: YARN-4002
> URL: https://issues.apache.org/jira/browse/YARN-4002
> Project: Hadoop YARN
> Issue Type: Improvement
> Reporter: Hong Zhiguo
> Assignee: Hong Zhiguo
> Priority: Critical
>
> We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By
> design the method ResourceTrackerService.nodeHeartbeat should be concurrent
> enough to scale for large clusters.
> But we have a "BIG" lock in NodesListManager.isValidNode which I think is
> unnecessary.
> First, the fields "includes" and "excludes" of HostsFileReader are only
> updated on "refresh nodes". All RPC threads handling node heartbeats are
> only readers. So an RWLock could be used to allow concurrent access by RPC
> threads.
> Second, since the fields "includes" and "excludes" of HostsFileReader are
> always updated by "reference assignment", which is atomic in Java, the reader
> side lock could just be skipped.
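
The two options in the description can be sketched roughly as follows. This is
only an illustration, not the actual HostsFileReader or NodesListManager code:
the class HostListsSketch, the Snapshot holder, and all method names are
hypothetical, and the validity rule (an empty include list admits every host)
is a simplifying assumption. One caveat for the second option: reference
assignment is atomic in Java, but without "volatile" a reader thread is not
guaranteed to see the new reference promptly, and two separate fields can be
observed in a mixed old/new state. The sketch therefore publishes both sets
together through a single volatile reference.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; names do not come from the Hadoop code base.
class HostListsSketch {

    // Immutable pair, so a reader always sees a matching includes/excludes set.
    static final class Snapshot {
        final Set<String> includes;
        final Set<String> excludes;

        Snapshot(Set<String> includes, Set<String> excludes) {
            this.includes = Collections.unmodifiableSet(new HashSet<>(includes));
            this.excludes = Collections.unmodifiableSet(new HashSet<>(excludes));
        }

        // Assumed rule: empty include list admits all hosts; excludes always win.
        boolean isValid(String host) {
            return (includes.isEmpty() || includes.contains(host))
                    && !excludes.contains(host);
        }
    }

    private static final Snapshot EMPTY =
            new Snapshot(Collections.<String>emptySet(), Collections.<String>emptySet());

    // Option 1: read-write lock. Heartbeat threads share the read lock,
    // "refresh nodes" takes the write lock.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private Snapshot guarded = EMPTY;

    void refreshWithLock(Set<String> includes, Set<String> excludes) {
        Snapshot next = new Snapshot(includes, excludes); // build outside the lock
        lock.writeLock().lock();
        try {
            guarded = next;
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean isValidNodeWithLock(String host) {
        lock.readLock().lock();
        try {
            return guarded.isValid(host);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Option 2: no reader-side lock. The volatile write publishes the fully
    // constructed snapshot; readers take one volatile read and use that copy.
    private volatile Snapshot published = EMPTY;

    void refreshLockFree(Set<String> includes, Set<String> excludes) {
        published = new Snapshot(includes, excludes);
    }

    boolean isValidNodeLockFree(String host) {
        Snapshot s = published; // single read, consistent for the whole check
        return s.isValid(host);
    }
}
```

With either variant, heartbeat threads no longer serialize on one monitor.
Option 2 is the cheaper of the two on the read path, at the cost of building a
fresh pair of sets on every refresh, which should be acceptable if refreshes
are as rare as the description says.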