[
https://issues.apache.org/jira/browse/YARN-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836216#comment-13836216
]
Xuan Gong commented on YARN-1028:
---------------------------------
Also:
1. RMProxy::createRetryPolicy() needs some changes. Otherwise, we have to
wait RESOURCEMANAGER_CONNECT_MAX_WAIT_MS (default is 15 mins) before trying
to connect to the next RM. That does not sound right.
2. Maybe make ConfiguredFailoverProxyProvider a pluggable class?
3. We wrap RetryPolicies.failoverOnNetworkException with
RetryPolicies.retryByException, so will we only retry on ConnectException
and IOException? What about other exceptions, such as SocketException?
4. Since we create the HA policy as RetryPolicies.failoverOnNetworkException,
we might need to mark some methods of ResourceTracker,
ApplicationClientProtocol, ResourceManagerAdministrationProtocol and
ApplicationMasterProtocol with the idempotent annotation. Without the
idempotent annotation, we can only retry on ConnectException, and will fail
immediately on IOException and SocketException (see
FailoverOnNetworkExceptionRetry::shouldRetry()).
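
To illustrate point 4, here is a minimal, self-contained sketch of the
decision described above. It is not the Hadoop source: the class
FailoverDecisionSketch, the Decision enum and the decide() method are
hypothetical names, and the logic is a simplified model that assumes only
what the comment states (ConnectException always fails over; other
IOExceptions, including SocketException, fail over only for idempotent
methods).

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketException;

// Hypothetical, simplified model of the behaviour described in point 4.
// None of these names come from Hadoop.
class FailoverDecisionSketch {
    enum Decision { FAIL, FAILOVER_AND_RETRY }

    // Decide what to do after a failed call, given whether the invoked
    // method is marked idempotent.
    static Decision decide(Exception e, boolean idempotent) {
        // The connection was never established, so the call cannot have
        // run on the server; retrying against another RM is safe.
        if (e instanceof ConnectException) {
            return Decision.FAILOVER_AND_RETRY;
        }
        // For other I/O failures (SocketException is an IOException) the
        // call may already have run, so retry only idempotent methods.
        if (e instanceof IOException) {
            return idempotent ? Decision.FAILOVER_AND_RETRY : Decision.FAIL;
        }
        // Anything else is not treated as a network failure in this sketch.
        return Decision.FAIL;
    }

    public static void main(String[] args) {
        Exception[] failures = {
            new ConnectException("connection refused"),
            new SocketException("connection reset"),
            new IOException("read failed")
        };
        for (Exception e : failures) {
            System.out.println(e.getClass().getSimpleName()
                + ": non-idempotent=" + decide(e, false)
                + ", idempotent=" + decide(e, true));
        }
    }
}
```

Under these assumptions, marking a protocol method idempotent is what turns
an IOException or SocketException from an immediate failure into a failover.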
> Add FailoverProxyProvider like capability to RMProxy
> ----------------------------------------------------
>
> Key: YARN-1028
> URL: https://issues.apache.org/jira/browse/YARN-1028
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Bikas Saha
> Assignee: Karthik Kambatla
> Attachments: yarn-1028-1.patch, yarn-1028-draft-cumulative.patch
>
>
> RMProxy layer currently abstracts RM discovery and implements it by looking
> up service information from configuration. Motivated by HDFS and using
> existing classes from Common, we can add failover proxy providers that may
> provide RM discovery in extensible ways.