[
https://issues.apache.org/jira/browse/YARN-6128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236470#comment-16236470
]
Subru Krishnan commented on YARN-6128:
--------------------------------------
Thanks [~botong] for the patch. Please see my comments below:
* General nit: many public methods are lacking Javadoc, especially in
{{FederationRegistryClient}}.
* Why are using both synchronized and concurrent HashMap in
{{FederationRegistryClient}}? What's the access pattern as I am wondering if
read-write locks might be more efficient?
* Nit: check the log levels as we should be using info only for actions. For
e.g.: we can debug when there's no update in
*FederationRegistryClient::writeAMRMTokenForUAM*.
* Can we load the entire state at one shot in
*FederationRegistryClient::loadStateFromRegistry* instead of multiple looping?
* I don't understand the need for _ignoreMemoryState_ (relates to first comment
about Javadoc)?
* In {{UnmanagedAMPoolManager}}, we are always setting
_setKeepContainersAcrossApplicationAttempts_, is that intended or should it
only be done when HA or restart is enabled?
* I don't grok why are we additionally persisting the _Credentials_ in
{{AMRMProxyApplicationContext}}. Where's it being consumed in the patch?
* I am not able to follow
*FederationInterceptor::reAttachUAMAndMergeRegisterResponse* invocation.
Shouldn't this be done during recovery? and why are we invoking this right
after register?
> Add support for AMRMProxy HA
> ----------------------------
>
> Key: YARN-6128
> URL: https://issues.apache.org/jira/browse/YARN-6128
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: amrmproxy, nodemanager
> Reporter: Subru Krishnan
> Assignee: Botong Huang
> Priority: Major
> Attachments: YARN-6128.v0.patch, YARN-6128.v1.patch,
> YARN-6128.v1.patch, YARN-6128.v2.patch, YARN-6128.v3.patch, YARN-6128.v3.patch
>
>
> YARN-556 added the ability for RM failover without loosing any running
> applications. In a Federated YARN environment, there's additional state in
> the {{AMRMProxy}} to allow for spanning across multiple sub-clusters, so we
> need to enhance {{AMRMProxy}} to support HA.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]