[ https://issues.apache.org/jira/browse/YARN-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681713#comment-14681713 ]
Dustin Cote commented on YARN-3924:
-----------------------------------
Yes, [~ajithshetty], that's the point I'm trying to get across. The scenario
that is problematic is:
{quote}
Configuring wrong/invalid ha.rm-ids at client is user mistake, this can be
rechecked by user.
{quote}
Returning "Connection Refused" gives the user no indication that this is what
happened. Generally, I see users looking for closed ports or firewall issues
when they get this message, when really they've just forgotten to change
their Oozie workflow to point to a logical RM name after enabling HA. This
kind of error is doubly hard to debug because it works intermittently: when a
failover occurs, their workflow suddenly starts working again! Yes, this is
the current RM HA design, so it's not as easy as changing the message or
exception type. That said, I still think it's a good
supportability/usability improvement.
> Submitting an application to standby ResourceManager should respond better
> than Connection Refused
> --------------------------------------------------------------------------------------------------
>
> Key: YARN-3924
> URL: https://issues.apache.org/jira/browse/YARN-3924
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: resourcemanager
> Reporter: Dustin Cote
> Assignee: Ajith S
> Priority: Minor
>
> When an application is submitted directly to a standby resource manager, the
> resource manager responds with 'Connection Refused' rather than indicating
> that it is a standby resource manager. Because the resource manager is aware
> of its own state, I feel like we can keep port 8032 open on standby
> resource managers and reject the request with something like 'Cannot process
> application submission from this standby resource manager'.
> This would be especially helpful for debugging Oozie problems when users put
> in the wrong address for the 'jobtracker' (i.e. they don't put the logical RM
> address but rather point to a specific resource manager).
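
The behavior proposed in the description can be illustrated with a toy model. This is only a sketch under stated assumptions, not YARN code: the class names, the `StandbyRejection` exception, and the `rm1`/`rm2` ids are all hypothetical, and the real ResourceManager is written in Java and speaks RPC. It shows a resource manager that is aware of its own HA state and rejects a submission with a descriptive message instead of leaving the client with a bare connection failure.

```python
from enum import Enum


class HAState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


class StandbyRejection(Exception):
    """Hypothetical error type; not the exception YARN actually raises."""


class ResourceManagerSketch:
    """Toy stand-in for a resource manager that knows its own HA state."""

    def __init__(self, rm_id, state):
        self.rm_id = rm_id          # hypothetical id such as "rm1"
        self.state = state
        self.accepted = []

    def submit_application(self, app_name):
        # Rather than keeping the port closed (which a client sees as
        # "Connection Refused"), answer with a message naming the real cause.
        if self.state is HAState.STANDBY:
            raise StandbyRejection(
                "Cannot process application submission from this standby "
                f"resource manager ({self.rm_id})"
            )
        self.accepted.append(app_name)
        return f"accepted {app_name} on {self.rm_id}"
```

With this shape, a client that points at a specific standby node gets an error that names the cause, which would steer the user toward checking the configured RM address rather than ports or firewalls.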