[ 
https://issues.apache.org/jira/browse/YARN-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107685#comment-15107685
 ] 

Jian He commented on YARN-4496:
-------------------------------

Uploaded a patch:
- Added a new RequestHedgingRMFailoverProxyProvider. When the client tries to 
fail over, it uses a separate proxy object to talk to each RM simultaneously. 
Each proxy keeps retrying its RM until one of them receives a response from 
the active RM; all other outstanding requests are then cancelled.
- Changed the default rm-retry-interval to 5 seconds; I think the 30-second 
interval is too long.
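
The hedging behaviour in the first bullet can be sketched roughly as below. 
This is only an illustration built on plain java.util.concurrent, not the code 
in the patch: the class HedgingSketch, the RmProxy interface and the hedge() 
method are hypothetical names, and a standby RM is assumed to reject a call by 
throwing an exception.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HedgingSketch {

    /** Hypothetical stand-in for a per-RM proxy; not a YARN interface. */
    interface RmProxy {
        String call() throws Exception;
    }

    /**
     * Calls every proxy concurrently. Each task retries its own RM until it
     * succeeds or is interrupted; the first successful answer wins.
     */
    static String hedge(List<RmProxy> proxies, long retryIntervalMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(proxies.size());
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (RmProxy proxy : proxies) {
                tasks.add(() -> {
                    while (true) {
                        try {
                            return proxy.call();
                        } catch (InterruptedException e) {
                            throw e; // cancelled: another proxy already won
                        } catch (Exception e) {
                            // Assumed: a standby or unreachable RM fails the call.
                            Thread.sleep(retryIntervalMs);
                        }
                    }
                });
            }
            // invokeAny returns the first successful result and cancels the rest.
            return pool.invokeAny(tasks);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        RmProxy standby = () -> { throw new IllegalStateException("standby"); };
        RmProxy active = () -> "rm2";
        System.out.println(hedge(Arrays.asList(standby, active), 50L)); // prints "rm2"
    }
}
```

In this sketch the retry interval only affects how often a losing proxy 
re-polls its RM, which is why a shorter interval lets a client notice a newly 
active RM sooner.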

> Improve HA ResourceManager Failover detection on the client
> -----------------------------------------------------------
>
>                 Key: YARN-4496
>                 URL: https://issues.apache.org/jira/browse/YARN-4496
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: client, resourcemanager
>            Reporter: Arun Suresh
>            Assignee: Jian He
>         Attachments: YARN-4496.1.patch
>
>
> HDFS deployments can currently use the {{RequestHedgingProxyProvider}} to 
> improve Namenode failover detection in the client. It does this by 
> concurrently trying all namenodes and picking the one that returns a 
> successful response first as the active node.
> It would be useful to have a similar ProxyProvider for the YARN RM (this 
> could possibly be done by converging some of the class hierarchies to use 
> the same ProxyProvider).
> This would be especially useful for large YARN deployments with multiple 
> standby RMs, where clients would be able to pick the active RM without 
> having to traverse a list of configured RMs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
