[jira] [Commented] (YARN-4496) Improve HA ResourceManager Failover detection on the client

Jian He (JIRA) Wed, 20 Jan 2016 13:08:07 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109416#comment-15109416
 ]


Jian He commented on YARN-4496:
-------------------------------

bq. Do we need that change?
This method actually uses this flag:
{code}
  public static RetryPolicy createRetryPolicy(Configuration conf,
      boolean isHAEnabled) {
{code}
bq. a bit flaky due to reliance on the clock and timing
The goal is to test that the client can discover the active RM quickly, so the 
test needs to check that the time interval is reasonably small. I'll remove the 
6-second sleep in startRMInThread, which should help with this. I think 10 
seconds overall should be more than enough. Of course, I'm open to any 
suggestions.
bq. Can we kill the RM threads before exiting the test case?
Do you mean the startRMInThread thread in the test case? That thread just calls 
a single method.
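
For illustration only, a time-bounded check of this kind could be sketched as below. This is not the code in the patch: the class name {{BoundedWaitSketch}} and the method {{elapsedMs}} are hypothetical, and the action passed in stands for whatever client call the test exercises. The sketch waits on a future with a timeout instead of sleeping for a fixed interval, and shuts down the helper thread before returning.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedWaitSketch {
    /**
     * Runs the action on a helper thread and returns the elapsed milliseconds.
     * Throws TimeoutException if the action does not finish within timeoutMs,
     * or ExecutionException if the action itself fails.
     * (Hypothetical helper, not part of YARN or the patch.)
     */
    static long elapsedMs(Callable<?> action, long timeoutMs) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        long start = System.nanoTime();
        try {
            // Block until the action completes or the time bound is exceeded.
            executor.submit(action).get(timeoutMs, TimeUnit.MILLISECONDS);
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } finally {
            // Interrupt the helper thread so it does not outlive the test case.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for "client discovers the active RM"; here it just sleeps briefly.
        long ms = elapsedMs(() -> { Thread.sleep(100); return "done"; }, 10_000);
        System.out.println("finished in " + ms + " ms");
    }
}
```

A test built this way fails as soon as the bound is exceeded, so the 10-second figure becomes an upper limit rather than a fixed wait.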

> Improve HA ResourceManager Failover detection on the client
> -----------------------------------------------------------
>
>                 Key: YARN-4496
>                 URL: https://issues.apache.org/jira/browse/YARN-4496
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: client, resourcemanager
>            Reporter: Arun Suresh
>            Assignee: Jian He
>         Attachments: YARN-4496.1.patch, YARN-4496.2.patch
>
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to 
> improve Namenode failover detection in the client. It does this by 
> concurrently trying all namenodes and picking the namenode that returns the 
> fastest with a successful response as the active node.
> It would be useful to have a similar ProxyProvider for the YARN RM (it can 
> possibly be done by converging some of the class hierarchies to use the same 
> ProxyProvider).
> This would especially be useful for large YARN deployments with multiple 
> standby RMs where clients will be able to pick the active RM without having 
> to traverse a list of configured RMs. 
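
The hedging idea described above (try every endpoint concurrently and keep the first successful answer) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the {{RequestHedgingProxyProvider}} implementation: the class {{HedgedProbe}}, the {{endpoint}} helper, and the ids rm1 to rm3 are all hypothetical, and a standby is simulated by throwing an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HedgedProbe {
    /** Hypothetical stand-in for one RM: returns its id if active, throws if standby. */
    static Callable<String> endpoint(String id, boolean active, long delayMs) {
        return () -> {
            Thread.sleep(delayMs); // simulated network latency
            if (!active) {
                throw new IllegalStateException(id + " is standby");
            }
            return id;
        };
    }

    /** Probes all endpoints concurrently and returns the first successful result. */
    static String findActive(List<Callable<String>> probes) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(probes.size());
        try {
            // invokeAny returns the result of a task that completed without
            // throwing; it fails with ExecutionException only if every task fails.
            return pool.invokeAny(probes);
        } finally {
            pool.shutdownNow(); // cancel any probes still in flight
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<String>> probes = new ArrayList<>();
        probes.add(endpoint("rm1", false, 10));
        probes.add(endpoint("rm2", true, 50));
        probes.add(endpoint("rm3", false, 10));
        System.out.println(findActive(probes)); // prints "rm2"
    }
}
```

With a sequential walk through the configured list, the client pays for each standby failure before it reaches the active node; with concurrent probes, the wait is roughly the active node's own response time.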



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
