[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681713#comment-14681713 ]
Dustin Cote commented on YARN-3924:
-----------------------------------

Yes, [~ajithshetty] that's the point I'm trying to get across. The scenario that is problematic is:
{quote}
Configuring wrong/invalid ha.rm-ids at client is user mistake, this can be rechecked by user.
{quote}
Returning "Connection Refused" gives the user no information that this is what happened. Generally, I see users looking for closed ports or firewall issues when they see this message back, when really they've just forgotten to change their Oozie workflow to point to a logical RM name after enabling HA. This kind of error is doubly hard to debug when it works intermittently (because when a failover occurs, suddenly their workflow starts working again!).

Yes, this is the current RM HA design, so it's not as easy as changing the message or exception type. That said, I still think it's a good supportability/usability improvement.

> Submitting an application to standby ResourceManager should respond better
> than Connection Refused
> --------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3924
>                 URL: https://issues.apache.org/jira/browse/YARN-3924
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Dustin Cote
>            Assignee: Ajith S
>            Priority: Minor
>
> When submitting an application directly to a standby resource manager, the
> resource manager responds with 'Connection Refused' rather than indicating
> that it is a standby resource manager. Because the resource manager is aware
> of its own state, I feel like we can have the 8032 port open for standby
> resource managers and reject the request with something like 'Cannot process
> application submission from this standby resource manager'.
> This would be especially helpful for debugging oozie problems when users put
> in the wrong address for the 'jobtracker' (i.e. they don't put the logical RM
> address but rather point to a specific resource manager).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
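
The behaviour requested in the issue description (a standby that accepts the connection and rejects the submission with a descriptive message) can be illustrated with a small toy model. This is only a sketch of the idea, not YARN code: `HaState`, `ToyResourceManager`, `StandbyRejectedException`, and `submitWithFailover` are hypothetical names invented for this example, and the failover loop is an assumption about how a client that knows every configured RM might behave.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout: this models the proposal, not YARN's real classes.
enum HaState { ACTIVE, STANDBY }

/** Raised when a standby instance receives a request it cannot serve. */
class StandbyRejectedException extends Exception {
    StandbyRejectedException(String message) { super(message); }
}

/** Toy stand-in for a resource manager that knows its own HA state. */
class ToyResourceManager {
    private final String id;
    private final HaState state;

    ToyResourceManager(String id, HaState state) {
        this.id = id;
        this.state = state;
    }

    /** Accepts the submission when active; otherwise rejects with a descriptive message. */
    String submitApplication(String appName) throws StandbyRejectedException {
        if (state == HaState.STANDBY) {
            throw new StandbyRejectedException(
                "Cannot process application submission from this standby resource manager ("
                + id + ")");
        }
        return "accepted " + appName + " on " + id;
    }
}

class StandbySketch {
    /** Tries each RM in order, as a client that knows all configured RMs might. */
    static String submitWithFailover(List<ToyResourceManager> rms, String appName)
            throws StandbyRejectedException {
        StandbyRejectedException last = null;
        for (ToyResourceManager rm : rms) {
            try {
                return rm.submitApplication(appName);
            } catch (StandbyRejectedException e) {
                last = e; // remember why this RM refused, then try the next one
            }
        }
        if (last == null) {
            throw new IllegalArgumentException("no resource managers configured");
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        ToyResourceManager rm1 = new ToyResourceManager("rm1", HaState.STANDBY);
        ToyResourceManager rm2 = new ToyResourceManager("rm2", HaState.ACTIVE);

        // Pointing at one specific RM that happens to be standby: the error says why.
        try {
            rm1.submitApplication("wf-job");
        } catch (StandbyRejectedException e) {
            System.out.println(e.getMessage());
        }

        // Trying every configured RM reaches the active one.
        System.out.println(submitWithFailover(Arrays.asList(rm1, rm2), "wf-job"));
    }
}
```

In this model, a user who hard-codes the address of `rm1` sees a message that names the real cause instead of something that looks like a closed port or firewall problem, which is the supportability gain argued for in the comment.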