[ https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680731#comment-14680731 ]
Wangda Tan commented on YARN-4002: ---------------------------------- [~xinxianyin], YARN-3091 is mostly for grouping all RM locking issues, some of critical items are resolved, other improvements are pending since they didn't become bottleneck yet. > make ResourceTrackerService.nodeHeartbeat more concurrent > --------------------------------------------------------- > > Key: YARN-4002 > URL: https://issues.apache.org/jira/browse/YARN-4002 > Project: Hadoop YARN > Issue Type: Improvement > Reporter: Hong Zhiguo > Assignee: Hong Zhiguo > Priority: Critical > > We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By > design the method ResourceTrackerService.nodeHeartbeat should be concurrent > enough to scale for large clusters. > But we have a "BIG" lock in NodesListManager.isValidNode which I think it's > unnecessary. > First, the fields "includes" and "excludes" of HostsFileReader are only > updated on "refresh nodes". All RPC threads handling node heartbeats are > only readers. So RWLock could be used to alow concurrent access by RPC > threads. > Second, since he fields "includes" and "excludes" of HostsFileReader are > always updated by "reference assignment", which is atomic in Java, the reader > side lock could just be skipped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)