[ https://issues.apache.org/jira/browse/YARN-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701689#comment-13701689 ]
Bikas Saha commented on YARN-513: --------------------------------- Why does this still have an nm prefix? {code} - public static final String RESOURCEMANAGER_CONNECT_WAIT_SECS = - NM_PREFIX + "resourcemanager.connect.wait.secs"; - public static final int DEFAULT_RESOURCEMANAGER_CONNECT_WAIT_SECS = + public static final String RESOURCEMANAGER_CONNECT_MAX_WAIT_SECS = + NM_PREFIX + "resourcemanager.connect.max.wait.secs"; + public static final int DEFAULT_RESOURCEMANAGER_CONNECT_MAX_WAIT_SECS = 15*60; {code} The above probably also applies to YarnConfiguration.RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_SECS. [~vinodkv] We should probably have renamed this conf address and port to match the others? {code} + if (protocol == ApplicationClientProtocol.class) { + return conf.getSocketAddr(YarnConfiguration.RM_ADDRESS, + YarnConfiguration.DEFAULT_RM_ADDRESS, + YarnConfiguration.DEFAULT_RM_PORT); {code} Why are MR and AMRMClient going through ClientRMProxy whereas YARNClient directly goes through RMProxy? {code} + try { + rmClient = + RMProxy.createRMProxy(getConfig(), ApplicationClientProtocol.class, + rmAddress); + } catch (IOException e) { {code} Typos in client and server rm proxy. {code} + String message = "Upsupported protocol found when creating the proxy " + + "connection to ResourceManager: " + + ((protocol != null) ? protocol.getClass().getName() : "null"); {code} Does ResourceTrackerClientPBImpl still need a close method after this code? If yes, then why does this code not call close() instead? {code} // NodeStatusUpdaterImpl.java + @VisibleForTesting + protected void stopRMProxy() { + if(this.resourceTracker != null) { + RPC.stopProxy(this.resourceTracker); + } + } {code} Does this need an @VisibleForTesting flag? {code} - protected ResourceTracker getRMClient() { + protected ResourceTracker getRMClient() throws IOException { {code} Why has this been removed in testConnectionNMToRM? Was this redundant earlier? Does some new test code cover this check? {code} MyNodeStatusUpdater4 myUpdater = (MyNodeStatusUpdater4) updater; - Assert.assertTrue("Updater was never started", - myUpdater.getWaitStartTime()>0); {code} Some of the tests probably dont need to wrap the fake ResourceTracker inside a RetryProxy. > Create common proxy client for communicating with RM > ---------------------------------------------------- > > Key: YARN-513 > URL: https://issues.apache.org/jira/browse/YARN-513 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager > Reporter: Bikas Saha > Assignee: Jian He > Attachments: YARN-513.10.patch, YARN-513.11.patch, YARN-513.12.patch, > YARN-513.13.patch, YARN-513.1.patch, YARN-513.2.patch, YARN-513.3.patch, > YARN-513.4.patch, YARN.513.5.patch, YARN-513.6.patch, YARN-513.7.patch, > YARN-513.8.patch, YARN-513.9.patch > > > When the RM is restarting, the NM, AM and Clients should wait for some time > for the RM to come back up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira