[ https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109416#comment-15109416 ]
Jian He commented on YARN-4496:
-------------------------------
bq. Do we need that change ?
This method actually uses this flag:
{code}
public static RetryPolicy createRetryPolicy(Configuration conf,
boolean isHAEnabled) {
{code}
bq. a bit flaky due to reliance on the clock and timing
The goal is to test that the client can discover the active RM quickly. To
check "quickly", I need to assert that the time interval is reasonably small.
I'll remove the 6-second sleep in startRMInThread, which should help with the
flakiness. I think 10 seconds overall should be more than enough. Of course,
I'm open to any suggestions.
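One possible shape for such a check, as a sketch only (this is not the code in the patch): time the discovery call with the monotonic clock and compare it against the overall limit, rather than relying on fixed sleeps. The class name, the elapsedMillis helper, and the simulated 100 ms discovery call are all hypothetical placeholders.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

public class DiscoveryTimingSketch {

  /**
   * Runs the given call once and returns how long it took in milliseconds.
   * Uses System.nanoTime(), which is monotonic, so wall-clock adjustments
   * do not affect the measurement.
   */
  public static long elapsedMillis(Callable<Void> call) throws Exception {
    long start = System.nanoTime();
    call.call();
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  public static void main(String[] args) throws Exception {
    // Assumed overall limit, taken from the "10 seconds overall" above.
    long limitMillis = TimeUnit.SECONDS.toMillis(10);

    // Placeholder for "client discovers the active RM"; here it just sleeps.
    long took = elapsedMillis(() -> {
      Thread.sleep(100);
      return null;
    });

    if (took >= limitMillis) {
      throw new AssertionError("discovery took " + took + " ms");
    }
    System.out.println("discovered within limit");
  }
}
```

The measurement still depends on how loaded the test machine is, so the limit needs to stay generous compared with the expected discovery time.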
bq. Can we kill the RM threads before exiting the test case ?
Do you mean the startRMInThread in the test case? That thread just calls a
single method.
> Improve HA ResourceManager Failover detection on the client
> -----------------------------------------------------------
>
> Key: YARN-4496
> URL: https://issues.apache.org/jira/browse/YARN-4496
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: client, resourcemanager
> Reporter: Arun Suresh
> Assignee: Jian He
> Attachments: YARN-4496.1.patch, YARN-4496.2.patch
>
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to
> improve Namenode failover detection in the client. It does this by
> concurrently trying all namenodes and picking the namenode that returns the
> fastest with a successful response as the active node.
> It would be useful to have a similar ProxyProvider for the YARN RM (it can
> possibly be done by converging some of the class hierarchies to use the same
> ProxyProvider).
> This would especially be useful for large YARN deployments with multiple
> standby RMs where clients will be able to pick the active RM without having
> to traverse a list of configured RMs.
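For illustration only, the hedging idea in the description above can be sketched with plain java.util.concurrent. This is a minimal sketch and not the actual RequestHedgingProxyProvider code: the Node interface, findActive, and the node helper are hypothetical names, and a thrown exception stands in for a standby node rejecting the call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HedgingSketch {

  /** Hypothetical stand-in for an RM or NameNode proxy; not a Hadoop interface. */
  public interface Node {
    String id();

    /** Returns normally if this node is active, throws if it is standby. */
    void probe() throws Exception;
  }

  /**
   * Probes all nodes concurrently and returns the id of the first node whose
   * probe succeeds. The list must not be empty.
   */
  public static String findActive(List<Node> nodes) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(nodes.size());
    try {
      List<Callable<String>> calls = new ArrayList<>();
      for (Node n : nodes) {
        calls.add(() -> {
          n.probe();
          return n.id();
        });
      }
      // invokeAny returns the result of a task that completed without
      // throwing and cancels the others; it throws ExecutionException
      // only if every task fails.
      return pool.invokeAny(calls);
    } finally {
      pool.shutdownNow();
    }
  }

  /** Test helper: a fake node that answers after a delay. */
  public static Node node(String id, boolean active, long delayMillis) {
    return new Node() {
      @Override
      public String id() {
        return id;
      }

      @Override
      public void probe() throws Exception {
        Thread.sleep(delayMillis);
        if (!active) {
          throw new IllegalStateException(id + " is standby");
        }
      }
    };
  }

  public static void main(String[] args) throws Exception {
    List<Node> nodes = Arrays.asList(
        node("rm1", false, 10),
        node("rm2", true, 50),
        node("rm3", false, 10));
    System.out.println(findActive(nodes)); // prints "rm2"
  }
}
```

The standby nodes fail quickly here, yet the caller still gets rm2, because failed probes are ignored as long as one probe succeeds. The client does not need to walk the configured list in order.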