[ 
https://issues.apache.org/jira/browse/YARN-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236470#comment-16236470
 ] 

Subru Krishnan commented on YARN-6128:
--------------------------------------

Thanks [~botong] for the patch. Please see my comments below:
* General nit: many public methods are lacking Javadoc, especially in 
{{FederationRegistryClient}}.
* Why are using both synchronized and concurrent HashMap in 
{{FederationRegistryClient}}? What's the access pattern as I am wondering if 
read-write locks might be more efficient?
* Nit: check the log levels as we should be using info only for actions. For 
e.g.: we can debug when there's no update in 
*FederationRegistryClient::writeAMRMTokenForUAM*.
* Can we load the entire state at one shot in 
*FederationRegistryClient::loadStateFromRegistry* instead of multiple looping?
* I don't understand the need for _ignoreMemoryState_ (relates to first comment 
about Javadoc)?
* In {{UnmanagedAMPoolManager}}, we are always setting 
_setKeepContainersAcrossApplicationAttempts_, is that intended or should it 
only be done when HA or restart is enabled?
* I don't grok why are we additionally persisting the _Credentials_ in 
{{AMRMProxyApplicationContext}}. Where's it being consumed in the patch?
* I am not able to follow 
*FederationInterceptor::reAttachUAMAndMergeRegisterResponse* invocation. 
Shouldn't this be done during recovery? and why are we invoking this right 
after register?

> Add support for AMRMProxy HA
> ----------------------------
>
>                 Key: YARN-6128
>                 URL: https://issues.apache.org/jira/browse/YARN-6128
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: amrmproxy, nodemanager
>            Reporter: Subru Krishnan
>            Assignee: Botong Huang
>            Priority: Major
>         Attachments: YARN-6128.v0.patch, YARN-6128.v1.patch, 
> YARN-6128.v1.patch, YARN-6128.v2.patch, YARN-6128.v3.patch, YARN-6128.v3.patch
>
>
> YARN-556 added the ability for RM failover without loosing any running 
> applications. In a Federated YARN environment, there's additional state in 
> the {{AMRMProxy}} to allow for spanning across multiple sub-clusters, so we 
> need to enhance {{AMRMProxy}} to support HA.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to