[
https://issues.apache.org/jira/browse/YARN-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17247393#comment-17247393
]
Xu Cang commented on YARN-10516:
--------------------------------
[~epayne] [~hexiaoqiao] Would you please review this Jira and patch? Thanks!
> In HA mode, when one Resource Manager has networking issue, getTokenService()
> should not throw runtime exception
> ----------------------------------------------------------------------------------------------------------------
>
> Key: YARN-10516
> URL: https://issues.apache.org/jira/browse/YARN-10516
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: client
> Reporter: Xu Cang
> Priority: Minor
> Attachments: YARN-10516.001.patch, YARN-10516.002.patch,
> YARN-10516.003.patch, YARN-10516.004.patch, YARN-10516.007.patch
>
>
> We have observed an issue in the YARN client around this piece of code:
> [https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java#L145]
>
> While
> {code:java}
> services.add(SecurityUtil.buildTokenService(
>     yarnConf.getSocketAddr(address, defaultAddr, defaultPort))
>     .toString());
> {code}
> is being called, buildTokenService() fails and throws a runtime exception
> (more specifically, an UnknownHostException) from here:
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SecurityUtil.java#L466]
> *while one of the RM hosts has a networking issue* and its IP cannot be
> resolved.
> This runtime exception then propagates all the way up to our application
> and causes the MR job submission to fail.
> In my opinion, since HA is enabled, the other RMs are still alive and
> available. We should catch this exception in getTokenService() and handle it
> properly, instead of failing the whole operation.
>
>
> I would like to hear your opinion on this. If you agree, I will provide a
> patch. Thank you.
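
For illustration only, the handling proposed in the description could be
sketched roughly as below. This is not the attached patch. The class
TokenServiceSketch and both of its methods are hypothetical stand-ins that
use only the JDK: buildTokenService() here merely imitates the reported
behaviour (an unchecked exception for an unresolved address), and the
comma-joined result and the "rethrow only if every RM failed" policy are
assumptions.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Hadoop.
class TokenServiceSketch {

    // Stand-in for SecurityUtil.buildTokenService(): assumed to reject an
    // unresolved address with an unchecked exception.
    static String buildTokenService(InetSocketAddress addr) {
        if (addr.isUnresolved()) {
            throw new IllegalArgumentException(
                new UnknownHostException(addr.getHostName()));
        }
        return addr.getAddress().getHostAddress() + ":" + addr.getPort();
    }

    // Builds one token service per RM address, skipping RMs whose host
    // cannot be resolved. Rethrows only when no RM could be used at all.
    static String joinTokenServices(List<InetSocketAddress> rmAddrs) {
        List<String> services = new ArrayList<>();
        RuntimeException lastFailure = null;
        for (InetSocketAddress addr : rmAddrs) {
            try {
                services.add(buildTokenService(addr));
            } catch (IllegalArgumentException e) {
                // One unreachable RM should not fail the whole call.
                lastFailure = e;
                System.err.println("Skipping unresolved RM address " + addr);
            }
        }
        if (services.isEmpty() && lastFailure != null) {
            throw lastFailure;
        }
        return String.join(",", services);
    }
}
```

With one resolvable and one unresolved address, joinTokenServices() returns
only the resolvable RM's service string. It still fails when no RM address
resolves, so a complete outage is not silently hidden.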