[ 
https://issues.apache.org/jira/browse/YARN-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17253675#comment-17253675
 ] 

Xu Cang commented on YARN-10516:
--------------------------------

[~epayne] [~hexiaoqiao] [~Jim_Brennan]

Hi, I would love to get a review on this. Thank you.

> In HA mode, when one Resource Manager has networking issue, getTokenService() 
> should not throw runtime exception
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-10516
>                 URL: https://issues.apache.org/jira/browse/YARN-10516
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: client
>            Reporter: Xu Cang
>            Priority: Minor
>         Attachments: YARN-10516.001.patch, YARN-10516.002.patch, 
> YARN-10516.003.patch, YARN-10516.004.patch, YARN-10516.007.patch
>
>
> We have observed an issue in the YARN client around this piece of code:
> [https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/ClientRMProxy.java#L145]
>  
> While 
> {code:java}
> services.add(SecurityUtil.buildTokenService(
>     yarnConf.getSocketAddr(address, defaultAddr, defaultPort)).toString());
> {code}
> is being called, buildTokenService() fails and throws a runtime exception, 
> more specifically an UnknownHostException, from here: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/SecurityUtil.java#L466]
>  *while one of the RM hosts was having a networking issue* such that its IP 
> could not be resolved.
> This runtime exception then propagates all the way up to our application and 
> causes MR job submission to fail. 
> In my opinion, since we have HA here, the other RMs are still alive and 
> available, so we should catch this exception in getTokenService() and handle 
> it properly instead of failing the whole operation. 
>  
>  
> I would like to hear your opinion on this. If you agree, I will provide a 
> patch. Thank you.
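
For illustration only, the handling proposed in the description could look roughly like the sketch below. It is not the attached patch and does not use the Hadoop classes: buildTokenService() here is a hypothetical stand-in that mimics the failure described (a runtime exception carrying an UnknownHostException for an unresolved address), and the class name, the resolved() helper, the comma-joined result, and the "skip and log" policy are all assumptions made for the example.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TokenServiceSketch {

  // Hypothetical stand-in for SecurityUtil.buildTokenService(): throws a
  // runtime exception wrapping UnknownHostException for an unresolved address.
  static String buildTokenService(InetSocketAddress addr) {
    if (addr.isUnresolved()) {
      throw new IllegalArgumentException(
          new UnknownHostException(addr.getHostName()));
    }
    return addr.getAddress().getHostAddress() + ":" + addr.getPort();
  }

  // Collects token services for every resolvable RM and skips the ones that
  // cannot be resolved, instead of letting one bad host fail the whole call.
  static String getTokenService(List<InetSocketAddress> rmAddresses) {
    List<String> services = new ArrayList<>();
    for (InetSocketAddress addr : rmAddresses) {
      try {
        services.add(buildTokenService(addr));
      } catch (IllegalArgumentException e) {
        // Assumed policy: log and continue with the remaining RMs.
        System.err.println("Skipping unresolved RM address " + addr
            + ": " + e.getCause());
      }
    }
    if (services.isEmpty()) {
      // Nothing usable is left, so failing here is still appropriate.
      throw new IllegalStateException("No RM address could be resolved");
    }
    return String.join(",", services);
  }

  // Builds an already-resolved address from raw IPv4 bytes (no DNS lookup).
  static InetSocketAddress resolved(byte[] ip, int port) {
    try {
      return new InetSocketAddress(InetAddress.getByAddress(ip), port);
    } catch (UnknownHostException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    List<InetSocketAddress> rms = Arrays.asList(
        resolved(new byte[] {10, 0, 0, 1}, 8032),
        // Simulates the RM whose hostname cannot be resolved.
        InetSocketAddress.createUnresolved("rm2.example.invalid", 8032));
    System.out.println(getTokenService(rms));
  }
}
```

With one resolvable and one unresolved RM, the sketch returns only the resolvable RM's service and still fails if no RM can be resolved at all. Whether skipping a host is the right behavior for the real client is exactly the question under review.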



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
