[ 
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836216#comment-13836216
 ] 

Xuan Gong commented on YARN-1028:
---------------------------------

Also:
1. RMProxy::createRetryPolicy() needs some changes. Otherwise, we have to wait 
RESOURCEMANAGER_CONNECT_MAX_WAIT_MS (default: 15 minutes) before trying to 
connect to the next RM. That does not sound right.

2. Maybe make ConfiguredFailoverProxyProvider a pluggable class?

3. We wrap RetryPolicies.failoverOnNetworkException with 
RetryPolicies.retryByException, so will we only retry on ConnectException and 
IOException? What about other exceptions, such as SocketException?

4. Since we create the HA policy as RetryPolicies.failoverOnNetworkException, 
we might need to mark some methods of ResourceTracker, 
ApplicationClientProtocol, ResourceManagerAdministrationProtocol and 
ApplicationMasterProtocol with the idempotent annotation. Without the 
idempotent annotation, we can only retry on ConnectException, and will fail 
immediately on IOException and SocketException (see 
FailoverOnNetworkExceptionRetry::shouldRetry()).
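
The behaviour described in point 4 can be sketched roughly as below. This is a 
minimal, self-contained Java model and not Hadoop code: the class 
FailoverDecisionSketch, its Decision enum and the decide() method are 
hypothetical names used only for illustration, and the handling of exceptions 
that are not IOExceptions is a simplification. It also shows why the order of 
the checks matters: in the JDK, ConnectException extends SocketException, 
which in turn extends IOException.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketException;

// Hypothetical, simplified model of the decision described in point 4.
// This is NOT Hadoop's FailoverOnNetworkExceptionRetry implementation.
final class FailoverDecisionSketch {

    enum Decision { FAILOVER_AND_RETRY, FAIL }

    // idempotent: whether the invoked protocol method is marked idempotent.
    static Decision decide(Exception e, boolean idempotent) {
        // ConnectException must be tested first because it is also a
        // SocketException and an IOException.
        if (e instanceof ConnectException) {
            return Decision.FAILOVER_AND_RETRY;
        }
        // Covers SocketException too, since it extends IOException.
        if (e instanceof IOException) {
            return idempotent ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
        }
        // Simplification for this sketch: anything else fails outright.
        return Decision.FAIL;
    }

    public static void main(String[] args) {
        Exception[] samples = {
            new ConnectException("connection refused"),
            new SocketException("connection reset"),
            new IOException("generic I/O failure")
        };
        for (Exception e : samples) {
            System.out.println(e.getClass().getSimpleName()
                + " idempotent=false -> " + decide(e, false)
                + ", idempotent=true -> " + decide(e, true));
        }
    }
}
```

In this model, only the idempotent flag lets a SocketException or a plain 
IOException trigger a failover, which is the motivation for annotating the 
protocol methods listed above.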

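The concern in point 1 comes down to simple arithmetic, sketched below. This 
assumes the retry policy sleeps for a fixed interval between attempts against 
the same RM, bounded by a maximum total wait. The class FailoverDelaySketch and 
its method are hypothetical names, and the 30-second retry interval is an 
assumed value that is not stated in this comment; only the 15-minute maximum 
wait comes from the text above.

```java
// Hypothetical sketch: how many attempts a fixed-sleep, bounded-time retry
// policy makes against one RM before it gives up and the next RM is tried.
final class FailoverDelaySketch {

    static long attemptsBeforeGivingUp(long maxWaitMs, long retryIntervalMs) {
        if (maxWaitMs < 0 || retryIntervalMs <= 0) {
            throw new IllegalArgumentException(
                "maxWaitMs must be >= 0 and retryIntervalMs must be > 0");
        }
        // Integer division: a partial trailing interval yields no extra attempt.
        return maxWaitMs / retryIntervalMs;
    }

    public static void main(String[] args) {
        long maxWaitMs = 15L * 60 * 1000;   // 15 minutes, as in point 1
        long retryIntervalMs = 30L * 1000;  // ASSUMED interval, for illustration
        long attempts = attemptsBeforeGivingUp(maxWaitMs, retryIntervalMs);
        System.out.println("Attempts against the same RM: " + attempts);
        System.out.println("Delay before trying the next RM (ms): "
            + attempts * retryIntervalMs);
    }
}
```

With those assumed values the client would make 30 attempts against the same 
RM and spend the full 15 minutes there before moving on, which is why the 
retry policy needs to change when failover is enabled.
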
> Add FailoverProxyProvider like capability to RMProxy
> ----------------------------------------------------
>
>                 Key: YARN-1028
>                 URL: https://issues.apache.org/jira/browse/YARN-1028
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Karthik Kambatla
>         Attachments: yarn-1028-1.patch, yarn-1028-draft-cumulative.patch
>
>
> RMProxy layer currently abstracts RM discovery and implements it by looking 
> up service information from configuration. Motivated by HDFS and using 
> existing classes from Common, we can add failover proxy providers that may 
> provide RM discovery in extensible ways.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
